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potential  of  providing  efficient  estimates  of  the  Class  A  parameters  for  small  sample  sizes  is  pro¬ 
posed.  For  the  single-parameter  estimation  problem,  it  is  shown  that  the  sequence  of  estimates 
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1.  INTRODUCTION 

Communication  systems  are  seldom  interfered  with  by  white  Gaussian  noise  alone,  yet  receiv¬ 
ing  systems  in  general  use  are  those  which  are  optimum  for  white  Gaussian  noise.  The  man-made 
electromagnetic  environment,  and  much  of  the  natural  one.  is  basically  "impulsive,"  i.e.,  it  has  a 
highly  structured  form  characterized  by  significant  probabilities  of  large  interference  levels.  In 
addition  to  the  man-made  electromagnetic  environment,  there  many  other  different,  common  and 
widely-used  communications  and  remote-sensing  type  channels  where  impulsive  noise  dominates, 
e.g..  extra-low-frequency  (ELF)  channels,  urban  radio  networks,  underwater  acoustic  channels, 
and  telephone  line  channels.  This  is  in  contrast  to  the  Gaussian  noise  processes  inherent  in  transmit¬ 
ting  and  receiving  elements.  Since  the  conventional  receivers  are  effectively  linear,  the  impulsive 
character  of  the  interference  can  drastically  degrade  the  performance  of  conventional  systems.  In 
fact,  it  has  been  well  established  [l]-[5]  that  the  performance  of  communications,  radar,  and  sonar 
systems  operating  in  impulsive  channels  can  be  greatly  enhanced  if  the  true  statistics  of  the  channel 
are  known  and  exploited.  Consequently,  the  problem  of  identifying  impulsive  noise  channels  is  an 
important  one  in  the  development  of  systems  that  can  approach  the  performance  limits  set  by  such 
channels.  To  do  so.  one  first  needs  to  develop  a  model  for  the  interference  that  fits  available  meas¬ 
urements.  is  physically  meaningful  when  the  nature  of  the  noise  sources,  their  distributions  in  time 
and  space,  propagation,  etc.,  are  taken  into  account,  is  directly  relatable  to  the  physical  mechanisms 
giving  rise  to  the  interference,  and  is  tractable  for  signal  detection  problems. 

A  physically-meaningful  and  widely-used  model  .  or  impulsive  interference  that  satisfies  the 
above  requirements  is  the  so-called  Class  A  Middleton  model  [6]-[8].  This  model  is  parametric, 
with  parameters  that  can  be  adjusted  to  fit  a  great  variety  of  non-Gaussian  noise  phenomena  arising 
in  practice.  The  parametric  nature  of  this  model  makes  it  amenable  to  identification  through  point 
estimation  techniques.  Furthermore,  this  model,  which  features  a  non-Gaussian  impulsive  com¬ 
ponent,  superimposed  on  a  Gaussian  background  component,  has  found  wide  application  in  several 
problems  of  practical  interest  (see.  e.g..  [5],[9]).  A  complete  description  of  Middleton's  Class  A 
noise  model,  including  its  derivation,  further  motivation,  and  taxonomy,  can  be  found  in  [5].[6],[8]. 
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This  study  is  devoted  to  the  problem  of  obtaining  global  optimal  and  near-optimal 
identification  procedures  for  the  Class  A  interference  model.  The  problem  of  estimating  the  param¬ 
eters  of  the  Class  A  model  was  first  considered  by  Middleton  in  [10]  and  [ll].  In  these  studies. 
Middleton  outlines  three  methods  for  determining  the  model  parameters.  The  first  is  an  empirical 
procedure  whereby  the  parameters  are  obtained  graphically  from  the  experimentally  determined 
distribution  function.  Expressions  for  the  parameters  in  terms  of  the  moments  of  the  Class  A  pro¬ 
bability  distribution  function  are  given  in  the  second  method.  The  third  procedure  requires  that 
experimental  values  of  the  distribution  function  ar.d  its  slope  at  vanishingly  small  thresholds  be 
available  The  parameters  are  then  determined  via  two  relations  involving  these  measurements. 
Other  work  on  the  Class  A  estimation  problem  includes  that  of  Powell  and  Wilson  [12].  wherein 
standard  batch  estimation  techniques  are  used  to  estimate  the  parameters  for  a  restricted  range  of 
parameter  values. 

We  begin  this  study  with  an  overview  of  the  Class  A  interference  model,  which  is  given  in 
Chapter  2.  In  Chapter  3.  the  problem  of  basic  batch  estimation  of  the  Class  A  parameters  from  an 
independent  sequence  of  Class  A  samples  is  considered.  In  particular,  within  the  context  of  batch 
estimation,  our  goal  is  two-fold  :  (i)  to  obtain  a  consistent  and  asymptotically  efficient  estimator  of 
the  parameters  and.  (ii)  to  obtain  a  practical  estimator  of  these  parameters  which  performs  well  for 
moderate  sample  sizes.  The  problem  of  recursive  identification  of  the  Class  A  parameters  is 
addressed  in  Chapter  4.  Our  objective  here  is  to  obtain  a  global  recursive  estimator  of  the  parame¬ 
ters  which  performs  well  for  all  parameter  vectors  in  the  parameter  set  of  interest.  In  Chapter  5, 
we  develop  an  efficient  estimator  of  the  paramters  with  good  small-sample-size  performance  glo¬ 
bally  A  summary  of  the  research  results  is  given  in  Chapter  6. 
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2.  HE  CLASS  A  INTERFERENCE  MODEL 

In  this  study,  we  will  focus  our  attention  on  the  model  defined  in  [8]  as  the  "strictly  canoni¬ 
cal"  Class  A  interference  model.  An  overview  of  the  model  will  be  given  in  this  section.  Further 
details  of  this  model  can  be  found  in  [6]-[8], 

In  Middleton's  strictly  canonical  Class  A  noise  model,  the  received  interference  is  assumed  to 
be  a  process  having  two  independent  components  : 

z(f)  =  z?(0  +  zc(f)  . 

The  first  component.  Zp  ,  is  represented  by 

Z?(r)  =  £f/,(r.y)  . 

j 

where  Uj  denotes  the  j  -th  waveform  from  an  interfering  source  and  y  represents  a  set  of  random 
parameters  which  describes  the  waveform  scale  and  structure.  It  is  next  assumed  that  only  one 
type  of  waveform.  U ,  is  generated,  with  variations  in  the  individual  waveforms  accounted  for  by 
appropriate  statistical  treatment  of  the  parameters  in  y,  and  the  generic  waveform  U  (f  )  is  obtained 
explicitly  from  the  underlying  physical  mechanisms  [6].  Under  the  additional  assumption  that  the 
sources  emit  their  waveforms  independently  according  to  the  Poisson  distribution  in  time,  the 
first-order  characteristic  function  for  ZP  is  given  by  (see,  e.g.,  [7]) 

F(i  £)p  =  exp  [  <AJ0  (Bg  —  A  >  ] 

where  Ba  denotes  the  envelope  of  U  when  U  is  written  in  envelope  and  phase  form,  Ja  is  the  Bessel 

function  of  order  zero,  and  <•>  denotes  required  sta  stical  averages  over  the  random  epoch 

representing  the  time  at  which  the  typical  j  -th  source  emits,  Doppler  velocities  (if  any),  and  the 

random  signal  parameters  in  y.  The  second  component,  ZG ,  is  an  additive  stationary  Gaussian 

background  process  attributable  either  to  receiver  noise  or  to  the  limit  of  a  high  density  Poisson 

process  representing  the  contributions  of  unresolvable  background  source^,  or  both.  Hence,  under 

2 

the  assumption  that  this  background  component  has  zero  mean  and  variance  c tg  ,  its  first-order 
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characteristic  function  is 


FbOa  =e 


-i2<r^n 


and  the  overall  characteristic  function  for  the  process  is  then  given  by 

which  can  be  approximated  canonically  as  follows  [7],[8]: 


FbOp+a  =e  Z  -  e 

m ! 


h2n 


(2.1) 


m  =0 


where 

cn~  m  <B*>/2  +  cTq  . 

and  A  is  a  positive  parameter  to  be  defined  below.  For  computational  purposes,  it  is  convenient  to 
consider  the  normalized  variable 

Z  =  XK<Xo>  +  <*/>)*. 

Transforming  (2.1)  for  the  normalized  variable  Z  yields  the  desired  probability  density  function 
(pdf): 

00  A  m  2m  2 

r  \  —A  x*  A  -*  n*m  .  . 

pziz)-e  £  - — -  e  •  (2.2) 

m  =0  m!V27TCTm 

where 


m 


+  r 


A 

i  +  r 


and  where  T  is  a  second  paiameter  (also  to  be  defined  below).  The  corresponding  Class  A  envelope 
pdf  is  given  by 
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w(z)  = 


2e~A  Z 

m  =0 


Am 
m. !  <y* 


z  e 


z  ^0 


0 


z  <0 


(2.3) 


It  is  the  envelope  statistics  which  will  be  used  in  the  estimation  problem  addressed  here.  Note  that 
this  envelope  pdf  consists  of  an  infinite  mixture  of  weighted  Rayleigh  densities.  The  m  —  0  term  is 
attributable  to  the  background  component  of  the  input  interference.1  whereas  all  terms  indexed  by 
m  ^  1  are  attributable  to  the  impulsive  (Poisson)  component  of  the  input  interference  plus  an 
appropriate  contribution  from  the  background  component  of  the  interference. 

The  two  basic  and  traditional  parameters  of  the  model  are  A  and  T.  Let  us  consider  their 
definitions  and  physical  significance: 

i)  A  is  the  "Overlap  Index"  or  "Nonstructure  Index."  Specifically. 

A  A  vf,  (2.4) 

where  v  is  the  average  number  of  emission  events  impinging  on  the  receiver  per  second  and  Ts  is 
the  mean  duration  of  a  typical  interfering  source  emission.  Note  that  v  is  simply  the  rate  of  the 
Poisson  process  underlying  the  impulsive  part  of  the  interference.  Thus.  A  is  a  measure  of  the 
amount  of  temporal  overlap  among  the  interfering  signals.  The  smaller  A  is.  the  fewer  the  number 
of  emission  "events"  and/or  their  durations  so  that  the  (instantaneous)  noise  properties  are  dom¬ 
inated  by  the  waveform  characteristics  of  individual  events.  As  A  is  made  larger,  the  noise 
becomes  less  structured,  i.e.,  the  statistics  of  the  instantaneous  amplitude  approach  the  Gaussian 
distribution  (asymptotically  as  A  ->oo,  although  A  =  10  is  considered  a  large  value  for  A  ). 


'Note  that  <r0J  =  T/(l  +  D  .  It  follows  from  the  definition  of  T  (given  in  (2.3))  that  crJ(r/(l  +  D)  is  simply  the  inten¬ 
sity  of  the  Gaussian  component  of  the  input  interference.  Thus,  even  though  the  m  ■  0  term  appears  to  depend  on  the  Class 
A  model  parameters,  via  some  simple  manipulations,  it  can  be  seen  that  the  only  quantity  it  actually  depends  on  is  the  inten¬ 
sity  of  the  Gaussian  component  of  the  input  interference.  Consequently,  the  m  =0  term  is  entirely  attributable  to  this 
background  component. 
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ii)  T  is  called  the  "Gaussian  factor."  It  is  the  ratio  of  the  intensity  of  the  independent  Gaussian 
compo^-nt  of  the  input  interference.  crG .  to  the  intensity  of  the  non-Gaussian  component,  i.e., 

2 

o-G 

T4  <ZG>/<Zp>  =  -  where  fl^AA  <B02>/ 2.  (2.5) 

&2A 

By  adjusting  the  parameters  A  and  T.  the  density  in  (2.2)  can  be  made  to  fit  a  great  variety  of 
non-Gaussian  noise  densities.  In  particular,  the  Class  A  model  is  appropriate  for  interference 
caused  by  intentionally  radiated  signals  (e.g..  as  in  the  crowded  HF  band)  and  has  also  found  con¬ 
siderable  application  in  various  acoustical  (e.g..  sonar)  problems.  Examples  of  Class  A  interference 
include  (depending  on  the  receiver  bandwidth)  the  emissions  of  various  man-made  devices  such  as 
radio  frequency  dielectric  heaters,  soldering  machines,  plastic  welders,  etc.,  as  well  as  natural 
phenomena  such  as  grinding  arctic  ice  plates.  Typical  values  for  the  parameters  include 
(A  =  10  ,r  =  50)  for  narrowband  interference  from  ore-crushing  machinery  and 
(A  =  0.35,  T  =  5.0  X  10-4)  for  power-line  emissions. 

Although  A  and  T  are  the  traditional  parameters  of  the  Class  A  model,  instead  of  estimating 
A  and  T.  we  will  consider  the  problem  of  estimating  A  and  K ,  where 

K& AT  .  (2.6) 

i.e., 

K  -  2  <rG/ <B2> . 

This  reparametrization  proves  useful  since  it  increases  the  analytical  tractability  of  the  estimation 
problem.  Finally,  throughout  this  study,  where  specific  values  of  A  and  K  are  considered,  we  will 
restrict  our  attention  to  the  parameter  set 

AAKA  JC?€X2:  10~2$A  <1  and  10"*<if  10*2} 

since  this  is  the  range  of  usual  practical  interest  for  these  parameters  (see,  e.g.,  [9].[10].) 


7 


3.  BASIC  BATCH  ESTIMATION 


3.1.  Introduction 

In  this  chapter,  we  will  consider  the  problem  of  basic  batch  estimation  of  the  Class  A  parame¬ 
ters  from  an  independent  sequence  of  Class  A  envelope  samples.  Within  the  context  of  batch  esti¬ 
mation.  our  goal  is  two-fold:  (i)  the  first  goal  is  to  provide  an  asymptotically  optimal  estimator  of 
the  parameters  of  Middleton's  strictly  canonical  Class  A  noise  model:  and  Cii)  the  second  goal  is  to 
provide  a  practical  estimator  for  these  parameters  that  performs  well  for  moderate  sample  sizes. 

The  starting  point  in  this  study  is  an  estimator  proposed  in  [10]  based  on  the  method  of 
moments,  in  which  parameter  estimates  are  chosen  to  make  population  moments  agree  with  sample 
moments.  In  Section  3.2  we  provide  an  analysis  of  the  asymptotic  performance  of  this  estimator. 
We  show  that,  although  this  estimator  is  strongly  consistent,  its  asymptotic  variance  for  one 
parameter  can  be  unacceptably  high  due  to  the  insensitivity  of  population  moments  to  this  parame¬ 
ter.  We  then  turn,  in  Section  3.3.  to  the  problem  of  asymptotically  efficient  estimation.  We  first 
analyze  the  estimation  potential  in  the  Class  A  model  by  considering  the  inverse  of  Fisher's  infor¬ 
mation  matrix  for  the  model  in  the  parameter  ranges  of  practical  interest.  It  is  seen  from  this 
analysis  that  the  Class  A  model  can,  in  fact.  afFord  good  estimates  of  all  of  its  parameters  if 
efficiency  can  be  achieved. 

We  then  consider  two  estimators  that  can  achieve  efficiency.  The  first  of  these  is  maximum- 
likelihood.  which  proves  to  be  unwieldy  due  to  root  multiplicity  problems  in  the  likelihood  equa¬ 
tion  and  to  poorly  behaved  gradients.  The  second  estimator  is  one  which  corrects  these  two 
difficulties  by  initiating  likelihood  search  with  the  method-of-moments  estimates  considered  in  Sec¬ 
tion  3.2.  Because  of  the  consistency  of  the  moments  estimator,  this  second  estimator  retains  the 
efficiency  of  maximum-likelihood  without  the  associated  computational  problems.  Unfortunately, 
simulations  show  that  this  estimator  does  not  perform  well  for  moderate  sample  sizes  for  most 
parameter  values  of  interest  due  to  the  extremely  low  efficiency  of  the  initiating  estimator  at  these 
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parameter  values.  Thus,  although  this  estimator  is  attractive  irom  a  theoretical  viewpoint,  its  use 
as  a  practical  estimator  is  limited.  However,  as  we  show  in  Section  3.4,  its  basic  feature  of  doing 
likelihood  iteration  to  improve  an  initial  estimator  can  be  used  to  develop  a  very  good  practical 
estimator.  In  particular,  in  Section  3.4  we  explore  (via  simulation)  the  moderate-sample-size 
performance  of  such  an  estimator  initiated  with  a  physically-based  estimator  motivated  by  a  pro¬ 
cedure  proposed  in  [10].  Our  simulation  of  this  estimator  indicates  that  it  achieves  practical 
efficiency  for  moderate  sample  sizes. 

Some  concluding  remarks  are  contained  in  Section  3.5. 

3,2.  A  Method-of -Moments  Estimator 

The  method  of  moments  is  a  simple,  intuitively  appealing,  and  computationally  expedient 
estimation  technique  introduced  by  K.  Pearson  in  1894.  The  problem  of  estimating  the  parameters 
of  the  Class  A  model  via  this  method  has  been  considered  by  Middleton  in  [lO],[ll],  In  this  section, 
the  asymptotic  properties  of  these  estimates  are  analyzed.  In  particular,  it  will  be  shown  that  the 
method  of  moments  yields  estimates  of  the  Class  A  model  parameters  which  are  strongly  consistent 
and  aymptotically  normal.  Furthermore,  explicit  expressions  for  the  asymptotic  variances  of  the 
normalized  estimates  will  be  obtained  and  computed  for  a  broad  range  of  parameter  values.  The 
performance  of  the  estimator  will  then  be  evaluated  on  the  basis  of  these  computations. 

3.2.1.  Parameter  estimates 

Let  Zi....,Zn  be  a  random  sample  from  the  Class  A  envelope  distribution  w  ( z )  with 
unknown  parameter  vector  0.=  (A  JC  )r  to  be  estimated.  In  the  sequel,  we  assume  that  the  obser¬ 
vations  Zi.i  =  1...../1  are  independent.  The  method  of  moments  consists  of  equating  an  appropri¬ 
ate  number  of  sample  moments  to  the  corresponding  moments  of  the  distribution,  which  are  func¬ 
tions  of  the  unknown  parameters.  By  considering  as  many  moments  as  there  are  parameters  to  be 
estimated,  and  solving  the  resulting  equations  with  respect  to  the  parameters,  estimates  of  the 


latter  are  obtained. 
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For  the  Class  A  envelope  distribution,  the  most  promising  moments  to  use  in  this  context  are 
the  fourth  and  sixth  moments.  These  are  the  lowest -ordered  even  moments  of  interest  (the  second 
moment  is  constrained  to  unity),  and  no  odd-ordered  moments  are  obtainable  in  closed  form.  Use 
of  higher-ordered  even  moments  can  result  in  multiple  solutions  when  equating  sample  and  popu¬ 
lation  moments.  Let  fJLj  denote  the  j  -th  moment  of  w  (  z  ).  Then,  (see,  e.g.,  [13]), 


2A  ,  ,  .  _  6A  ,18  A 

=  7  r-r-mxT  +  2  and  fib  =  - - —  +  +  6. 


(3.1) 


(A  +K  )2  '  '  '  "  °  (A+if)3  '  (A+K)2 

For  A  ^  0.  hence  for  all  0€A,  inversion  of  (3.1)  yields  unique  expressions  for  the  parameters  A 
and  K  in  terms  of  /x4  and  fi6.  Specifically, 


and 


A  = 


M 6  _  3/^4 
6  2  2 


4  /l(/i4.ft) 


(3.2a) 


K  - 


x- 


x"‘ 


M6 

6 


fH 

6 


4  /2  W  ^6). 


(3.2b) 


The  method-of-moments  (MAf  )  estimators  based  on  these  two  moments,  j9  „  =  (  An  JCn  Y  ,  are  then 
given  by 


(3.3a) 


and 
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-1 

m4 

2 

2 

— 

— 

3^4  , 

m  6  3/n  4 

6  2 

6  2 

(3.3b) 


where  m4  and  m6  denote  the  4-th  order  and  6-th  order  sample  moments,  respectively,  i.e.. 
m4  =  —  £  Z /  and  m6  =  —  £  Z/. 


l=i 


n  l=i 


Consider  (3.2a)  and  (3.2b).  Note  that  fx  and  /2  are  discontinuous  only  on 

{(/a4.  /x6)r  €JR2  :  ~  +  2  =  0},  i.e.,  on  {(/*4.  £t6)r  €2?  2  :  //6  =  9/i4  —  12}.  Using  the  expressions 

6  2 


3/j,a  a 

for  /n4  and  /n6  given  in  (3.1),  we  have  that  —  — —  +  2  =  - — 

o  2  (A  +K  )J 


.  Hence,  if  A  >  0  and 


K  >  0.  then  /a6  >  9/*4  —  12  and  nA  >  2.  where  the  latter  inequality  follows  from  the  expression 
for  given  in  (3.1).  Thus,  via  relations  (3.1).  the  parameter  set  A  maps  into  the  open  set 
Cl  4  {(, uA.fxbY  ejR2:fiA  >  2  and  n 6  >  9/*4  —  12}  on  which  fx  and  f2  are  defined  and  continuous. 
This  fact  will  be  used  below. 


3.2.2.  Asymptotic  properties  of  6_n 

In  this  section  we  consider  the  asymptotic  properties  of  the  estimators  of  (3.3).  These  proper¬ 
ties  are  summarized  in  the  following  two  results. 

Theorem  3.1.  (Consistency)  :  The  MM  estimator  Q_n  is  a  strongly  consistent  estimator  of  9_  for 
all  0.  €  A. 

Proof.  Let  jx q  4  (/*4.  (tgY  and  m  Since  Zx . Z„  are  i.i.d.  and  €  JR2  for  j).G  A,  we 

aj. 

have  that  m  -*  jig  by  the  Strong  Law  of  Large  Numbers  (SLLN),  which  implies  that 

lim  m  (3.4) 

n  —ao 

on  a  set  w.p.l.  Now,  A  -fx  (jxg  )  and  An  —fx  (  m  ).  Thus,  it  follows  from  continuity  of  fx  on  Cl 
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and  (3.4)  that  lim  An  =  A  on  a  set  w.p.l.  Similarly,  since  K  —  and  Kn  =  fz  (  21  )•  we 

n  -*oo 

a.  ^  aj. 

have  by  continuity  of  /2  on  A  and  (3.4)  that  lim  Kn  =  K  on  a  set  w.p.l.  Thus.  -*  0.  and  the 

71  “*00 

proof  of  consistency  is  complete.  □ 

Theorem  3.2.  (Asymptotic  Normality )  :  For  each  0.6  A.  \fn  (§_„  —0.)  is  asymptotically  normal 
with  mean  zero  and  covariance  matrix  ,  where 

Varg(Z4)  Covg(Z4.Z6) 

t*  & 

Covq(Z4.Z 6)  VargCZ6) 

and 

3A  3A 

df*4  6a 6 

Bl  4 

6/2  6/2 

a^4  a/*e 


Proof.  Let  4  (/A4.  •  G  4  (m4.  m  6y  and  X„  =  (Z  /  ,Z  ^  )r  where  Z 1 . Z„  is  our  sequence  of 

i.i.d  observations.  Then  is  i.i.d.  with  mean  vector  ^9  and  covariance  matrix  (Expres¬ 

sions  for  the  elements  of  £9  will  be  given  below.  We  note  here  that  all  are  defined  and  finite  for 

^  n  1  n 

0. 6  A.)  Since  m  4  =  —  £  Z  „4  and  m  6  =  —  £  Z  $  ,  we  have  by  the  multivariate  Central  Limit 

n  i/=i  n  i/=i 

Theorem  (CLT)  (see  [14],  Thm.  5.1.8)  that 

[•Jn  (m4  —  m4)  .  Vn" (m6  —  m6)]r  -*  N (.  0..  ^9) 

as  n  -*oo.  Now,  if  it  can  be  shown  that  A  and  A  are  real-valued  functions  of  u.$-  defined  and  con¬ 
tinuously  differentiable  in  a  neighborhood  9)  of  and  such  that  the  matrix  Bg  is  nonsingular 
in  cu,  then  (see  [14],  Thm.  5.1.9) 
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[\fn  (A  ( m  )  —  fi  (ite))  .  Jn  (/2  C  m  )  —  f2  ( i*e))F  -*  N  (  0.,  i?e  )  (3.5) 


as  n  -*oo,  whence  it  follows  that 


[n/tT  C4„  —  A)  .  \fn  (jtn  -KW-+N  (Q_.Bei6.Bl ) 


(3.6) 


as  n  -*oo.  Note  that  we  are  concerned  with  the  validity  of  (3.5)  only  for  all  ^e.  corresponding  to 
0_€  A.  Hence,  we  proceed  to  verify  the  above  conditions  for  all  ji<^  in  the  open  set  ft. 

i)  That  fi  and  f2  are  real-valued  functions  on  ft  is  clear. 

5/ 

ii)  To  show  that  /t  and  f2  are  continuously  differentiable  on  ft.  it  suffices  to  show  that  —A  and 

0/i4 

0/' 

— -,i  =1.2.  are  continuous  on  this  set.  Thus,  let  us  consider  the  form  of  these  partial  deriva- 
0M6 

tives:  Let  a  4  —  1.  )3  4  •—  —  +  2.  Then, 

2  6  2 


0A 

0^4 

0/l 

0^6 


=  3 


+J 


0/2  _  3a  +  1  _  ^ 

2?  f 


(3.7a) 

(3.7b) 

(3.7c) 


and 


0/2  _  _  a  1 

0A*6  6/32  3 


a 


(3.7d) 


Note  that  the  partial  derivatives  are  discontinuous  only  on  {^.9  €  l?2  : /3  =  0}.  But 

{jie  €  JR2  :  +  2  =  0}nft=<£by  definition  of  ft. 

—  6  2 


iii)  To  show  that  Be  is  nonsingular  for  ft .  we  must  show  that  det5g  ^  0  for  all  such  £.9. 


Now, 
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det  Bq  = 


dfl 

5/2 

_  dfl 

dh 

*1  5/^6 

Substituting  the  expressions  given  for  the  partial  derivatives  in  (3.7a-d)  and  simplifying,  we  have 
that 


det  Be  = 


—Of3 

12/F 


lite 


which  is  nonzero  iff  a  ^  0.  Thus,  Bq  is  nonsingular  for  all  jmq  with  /jl4  ^  2.  But 
ClC{ji0  6  R2  :  2}  by  definition  of  Cl.  Thus.  J3g  is  nonsingular  for  all  a_e_ e  Cl. 

The  verification  of  the  conditions  is  now  complete.  Consequently.  (3.6)  holds  for  all  0_  €  A, 


i.e.,  0.„  is  asymptotically  normal 


9_,  —Bg^^j^Bf 


for  0  €  A. 


□ 


A  A 

3.23.  Asymptotic  performance  of  An  and  Kn 

A  A 

The  performance  of  the  normalized  MM  estimates  An  I  A  and  Kn  IK  will  now  be  considered. 
Expressions  for  the  asymptotic  variances  of  these  estimates  will  be  obtained  and  computed  for  a 
broad  range  of  parameter  values  in  A.  Note  that  the  normalizations  are  necessary  in  order  that  a 
meaningful  comparison  of  the  computed  variances  be  made  since  the  parameters  take  on  widely 

varying  values.  Let  C\ 4  0]  and  C2  4  [0.  ~].  Then,  from  (3.6),  we  have  that 

A  A 

Ci  [V?T  (An  -  A  )  ,  yfc  (Kn  -  K)]r  -  N  (0  .  C15i^i5/C1r  ) 

and 

C2[^an  -A),v£"(£„  -KW°N(0,C2BeieB[CS)  . 


i.e.. 


N<S>  .C^B^BlCl) 


and 


(3.8a) 
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^  -  V1  -* N(O.C2BeieB[C$ ). 


(3.8b) 


Let  and  €  K  denote  the  asymptotic  variances  of  An  /A  and  Kn  IK  .  respectively.  Then. 

6^  =  C^gSfCf  (3.9a) 

and 

=  C2Bt%tB[Cl  .  (3.9b) 

We  seek  expressions  for  these  variances  in  terms  of  the  parameters  A  and  K .  To  this  end.  we 
begin  by  substituting  the  corresponding  matrices  for  B^,  ^g ,  Ci .  and  C2.  From  (3.9a)  we  have  that 


a/i  a/i 

3^4  a^6 


Vare(Z 4)  Covg(Z\Z6) 


a/2  a/2  |Cov8(Z4.Z6)  Var8(Z6) 

a^  a^6 


i  aA  , 

— t  -r —  Vor9  ( 
A 2  3^4  ^£  ~ 


Similarly,  (3.9b)  yields 


a/i  j  dh 


a/i  a/i 

a/^4  a/^6 


Var  g(Z4)  Cov9  (Z4.  Z6) 


€*  =  0  ‘ 


a/2  a/2  |  Cove. (Z4.  Z6)  Vare(Z6) 

3^4  3m6  Me 


Var9(Z4)  +  2Cove(Z4.Z6) 

3^4  *£  -  ~  3^4 


a/2 

3^6 


.  and  C2.  F 

a/i 

a/i 

3/*4 

3/^6 

a/2 

3/2 

3/*4 

3/^6 

4. 

a/i 

3^6 

a a 

a/i 

3^4 

3^6 

a/2 

a/2 

3^4 

3^6 

+ 

a/2 

i±£ 

3^6 

(3.10a) 


Varg  (Z6)  . 


(3.10b) 
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Expressions  for  tie  partial  derivatives 


dfi  dfi  . 
dfJ-4  '  3/^6  ’ 


1,  2,  have  been  given  in  (3.7a-d)  in  terms  of 


(jl4  and  which  in  turn  can  be  expressed  in  terms  of  A  and  K  via  (3.1).  Expressions  for  the 
eighth,  tenth,  and  twelfth  moments  of  w  (  z  )  can  be  obtained  in  a  manner  completely  analogous  to 
that  used  in  obtaining  fi4  and  fjL6.  Thus.  Var^(Z4).  Var^Z6).  and  ,Cov^(Z4.  Z6)  are  readily 
expressed  in  terms  of  A  and  K .  The  final  results  are  stated  here: 


3/i 

3/^4 


=  304  +H if)3  +  -Ju  +K)2. 
as  2 


(3.11a) 


3/i 

3^6 


H9 


-jU  +K)3. 


3/2  _  3  U+iT)4  ^  (1—6/4  ) 

Q/j.4  ng  2  A  2 A 


( A  +K)3  -  1(A  +K)2. 


3/2 

3/^6 


i£9 


1  C4+if)4  J 
6  /4 


I(A  +*)3. 


(3.11b) 

(3.11c) 

(3. lid) 


Varo  (  Z  4  )  =?Q4.  136A  +  (244  -  4/4  2)  +  24/4  (74  +4if  ) 

(A+K)2  ( A+K) 4  (4  +iT  )4 


Var0(Z6)  =  — ZH®  [  (A  6+  15/4  5  +  65/4  4  +  90/4  3  +  31AZ  +  A) 
G4+if)6 

+  (6/4  5K  +  60 A  4K  +  150 A  3K  +  90A  2K  +  6AK  ) 


(3. lie) 


(3. Ilf) 


+  (154  4K2  +  90A  3K2  +  105/4 2K2  +  1 5AK2) 


+  (,20 A  3K3  +  60A  2K3  +  20/4  if3) 

+  (15/4  2K4  +  15/4  if4  )  +  (6AK5  +  if6)  ] 

6A  18^4  +5 

(A+K)3  (A+K)2 


|2 
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Covg  (Z  4  .  Z  6  )  =  “0  [CA*+  10A  4  +  25A  3  +  15A  2  +  A  ) 

(a  +a-  y 

+  (5A  “AT  +  3 OA  *K  +  35 A  2K  +  5AK  ) 

+  (10  A3K2  +  30  A2K2  +  lOAiT2) 


+  (ioa2.k:3  + 10A.&:3)  +(5Ai:.4)  +ks] 


2A  +  2 

6A  +  ISA  ^ 

(. A+K) 2 

(a  +k  y  (a  +k  y 

(3.11g) 


The  asymptotic  variances  and  have  been  computed  for  {(A  .  K  )r  €  A|  log  A  €  {0. — 1 . — 2} 
and  log  if  €  {—2.— 3,— 4.— 5.— 6}}  using  (3.10a).  (3.10b),  and  (3.11a-g)  (see  Tables  3.1  and  3.2).  In 
addition,  values  for  the  asymptotic  mean-square  norm  relative  error  (MSNRE)  €r  .  €r  4  ■ 

are  given  in  Table  3.3.  Note  that  the  primary  contribution  to  €r  is  from  .  In  fact,  for  K  <<  A  , 
the  contribution  to  €r  from  is  negligible  and  it  is  for  this  case  that  €r  becomes  extremely 
large.  Hence,  consider  . 

For  A  fixed.  €x(AJf)  increases  as  K  decreases  and  takes  on  very  large  values  when 
K  <  <  A  .  For  K  fixed.  (A  JC  )  increases  as  A  increases  and  again  takes  on  very  large  values 
when  A  >>  K  .  Note  specifically  that  when  A  - 1  the  asymptotic  variance  6^  becomes  extremely 
large.  These  observations  are  easily  explained  when  one  considers  the  form  of  and  /u. 6  (see  (3.1)). 


Table  3.1.  ASYMPTOTIC  VARIANCE  OF  A„  /A  (£*) 


K 

A 

1CT2 

1(T3 

i<r* 

10~s 

1CT6 

10~2 

2.550337X103 

2.468429X103 

2.460431X103 

2.459633X103 

2.459554X103 

nr1 

1.472224X103 

1.448790X103 

1.446479X103 

1.446248X103 

1.446225X103 

1 

4.792669X103 

4.750635X103 

4.746463X103 

4.746046X103 

4.746005X103 
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Table  3.2.  ASYMPTOTIC  VARIANCE  OF  Kn  IK  (  €*  ) 


6.663680X103 

5.264450x10“ 

5.963216X106 

6.041263X10* 

6.049149X1010 

2.449795X104 

2.951534x10* 

3.005101X10* 

3.010491X1010 

3.011030X1012 

1.073910X107 

1.083890x10’ 

1.084889X1011 

1.084989X1013 

1.084999X1015 

Table  3.3.  ASYMPTOTIC  MSNRE  FOR  MM  ESTIMATOR  (  6r  ) 


io-2 

3.216705X103 

5.511293X104 

5.965677X106 

6.041288X10* 

6.049150X1010 

10'1 

2.597017x10“ 

2.952983X106 

3.005115X10* 

3.010491X1010 

3.011030X1012 

1 

1.074389X107 

1.083895X109 

1.084889X10" 

1.084989X1013 

1.084999X10" 

IS 


Note  that  fj.4  and  n6  depend  inversely  on  powers  of  ( A  +K ),  and  K  appears  only  in  this  way. 
Thus,  when  K  <  <  A  ,  pl4  and  are  insensitive  to  changes  in  K .  This  is  evident  in  Tables  3.4  and 
3.5.  For  A  fixed,  it  becomes  increasingly  difficult  to  resolve  pi4  [pi6]  as  K  decreases,  particularly  for 
K  <<  A  .  hence  the  increasing  values  for  .  The  second  observation  stated  above  is  accounted 
for  by  the  fact  that,  for  K  fixed,  increasing  A  makes  K  relatively  small  in  comparison  to  A.  , 
which  in  turn  makes  the  moments  and  the  estimator  less  sensitive  to  K .  For  A  =  1  ,  [//6]  are 

nearly  the  same  for  all  K ,  since  K  <  <  A  for  all  values  of  K  under  consideration.  In  fact.  [/a6] 
for  K  =10-5  equals  Lu6]  for  K  —  10-6  up  to  five  significant  digits.  Thus,  in  order  that  a  reason¬ 
able  estimate  of  K  be  obtained,  the  number  of  samples  used  must  be  large  enough  so  that  a  resolu¬ 
tion  of  the  moments  up  to  five  decimal  places  is  achieved.  (It  should  be  noted  that  this  insensi¬ 
tivity  of  the  moments  to  changes  in  the  value  of  K  when  K  <<  A  is  also  evident  in  the  higher- 
ordered  moments.) 

Consider  the  worst-case  error  for  €r ,  which  occurs  for  A  -  1.  K  =  10-6.  A  comparison  of 
Tables  3.2  and  3.3  indicates  that  this  quantity  is  essentially  the  asymptotic  variance  .  Given 
that  A  =  1.  K  =  10-6,  suppose  we  want  the  probability  that  (A„  /A  ,  Kn  IK  )  lies  within  a  circle  of 
radius  0.1  with  center  at  (1.1)  to  be  0.9.  Let 


C  4 


A  0 

0  J_ 
K 


(3.12) 


Since,  from  (3.6).  we  have  that 


\fn 


An  ~  A 


\/n 


Kn-K 


NiO.CBeteBlC1"). 


a  straightforward  calculation  shows  that 


(An  -  A)2  (Kn  -K)2 


K2 


<  n  (O.l)2 


0.9 
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Table  3.4.  FOURTH-ORDER  POPULATION  MOMENT  (/*„) 


io-2 

10"3 

10““ 

10~s 

io-* 

10-2 

5.200000X101 

1.672893X102 

1.980592X102 

2.016006X102 

2.019600X102 

icrl 

1.852893X10* 

2.160592X10* 

2.196006X101 

2.199600X101 

2.199960x10* 

l 

3.960592 

3.996006 

3.y99600 

3.999960 

3.999996 

Table  3.5.  SIXTH-ORDER  POPULATION  MOMENT  (^) 


a 

10-2 

10"3 

10"“ 

10~5 

10"6 

10-2 

7.9560000X103 

4.6572491x10“ 

6.0005942X10“ 

6.1622765X10“ 

6.1787644X10“ 

m 

6.0554921X102 

7.6480738X102 

7.8384413X102 

7.8578404X102 

7.8597840X102 

i 

2.9468870X101 

2.9946090X101 

2.9994601x10* 

2.9999460X10* 

2.9999946X10* 
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requires  n(0.l)2  to  be  approximately  2.94  XlO15,  or  n  to  be  approximately  2.94  XlO17.  This  is  an 
unrealistically  large  sample  size  for  most  applications  and  thus  reveals  the  moments  estimator  to  be 
highly  inefficient  in  this  sense.  The  potential  poor  performance  of  the  moments  estimator  has  been 
verified  via  simulation  for  a  wide  range  of  parameter  values.1  Whether  this  inefficiency  is  inherent 
in  the  Class  A  model  or  is  a  property  of  the  method-of-moments  estimator  will  be  determined  in 
Section  3.3. 

33.  Asymptotically  Efficient  Estimation 

We  have  seen  in  the  previous  section  that  the  method  of  moments  yields  a  strongly  consistent 
and  asymptotically  normal  estimator  of  the  parameters  of  the  Class  A  model.  However,  the  MM 
estimator  has  a  serious  shortcoming.  Specifically,  for  values  of  K  <  <  A  ,  the  asymptotic  MSNRE 
is  astoundingly  large.  This  is  due  to  the  fact  that  the  MM  estimator  is  highly  insensitive  to 
changes  in  the  parameter  K  when  K  <<  A  .  A  natural  question  that  arises  is  whether  this  insensi¬ 
tivity  is  a  property  of  the  MM  estimator  or  is  an  inherent  feature  of  the  Class  A  model,  i.e.,  is  it 
possible  to  improve  on  the  performance  of  the  MM  estimator?  To  answer  this  question,  let  us 
examine  the  Cramer-Rao  Lower  Bound  (CRLB). 

3.3.1.  The  Cramer-Rao  Lower  Bound 

We  begin  with  the  following  assertion.  Let  .0  *  =  (.A*.  K*Y  denote  an  estimator  of 
0.=  (A  based  on  n  i.i.d.  observations.  Under  regularity  conditions  on  the  class  of  estimators 

0.*  under  consideration  [  15],  it  may  be  asserted  that  if  0.*  is  asymptotically  normal  with  mean 
vector  J0  and  covariance  matrix  then  the  condition 

C[$i-[/(fl)]-l]Cr  £0  (3.13) 

must  hold,  where  C  is  the  nonsingular,  symmetric  matrix  defined  in  (3.12)  and  7(0.)  denotes 
Fisher's  information  matrix: 

‘This  inefficiency  is  also  corroborated  for  a  restricted  range  of  parameter  values  by  experimental  results  presented 
in  [12]. 
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/(!)=£* 


A  log  W  (Z  ) 


A-  log  w(Z) 


Xlogw(Z) 


A  logwCZ) 


log  w(Z) 


Jgr  lOg  W  (Z  ) 


.  (3.14) 


It  follows  from.  (3.13)  that 

tr[C[%6-[ni))-l}CT]>  0 

which,  in  turn,  yields  the  following  lower  bound  on  the  asymptotic  MSNRE  of  0.*: 

fr[C2$e]  ^  fr[C2[/(A)]-1]-  (3.15) 

Now,  if  0.*  is  an  asymptotically  efficient  (AE )  estimator  of  0.,  then  ^  =  [/(^.)]  1  and  (3.15) 
holds  with  equality.  Thus,  let  €'  denote  the  asymptotic  MSNRE  for  an  AE  estimator.  Then. 

—  tr[C2  [I (A)]-1  !•  (3.16) 

We  will  now  consider  the  contribution  to  €r  due  to  estimating  each  parameter  in  the  two- 
parameter  estimation  scheme.  Let  denote  the  asymptotic  relative  variance  due  to  estimating  A 
via  an  AE  estimator  and  let  denote  the  corresponding  quantity  for  the  parameter  K .  Since 
=  [/(A)]-1  for  an  AE  estimator,  it  follows  that 


e;  =  v,  ( c2  i'±)  vf  =  [  c2  [/(A)]-1  ]  v[ 

(3.17a) 

and 

€*  =  V^2  (  C2  $9  )  Vl  =  V2[C2ll  (A)]'1  ]  W 

(3.17b) 

where  4  [l.0]andV2  4  [O.l].  Thus,  (3.17a)  and  (3.17b)  imply  that 

e;  =[/(A)]fi1M2 

(3.18a) 

and 

*'k  =[/(A)]2_21/ir2 

(3.18b) 

where  [/  ( A)]//  denotes  the  jj-th  element  of  the  inverse  of  the  matrix  I  (A)- 
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Examining  the  theoretical  lower  bounds  on  the  asymptotic  MSNRE  and  asymptotic  relative 
variances  as  given  by  (3.16)  and  (3.18a,b),  we  can  determine  whether  an  improvement  in  perform¬ 
ance  over  the  method-of-moments  estimator  is  possible.  Moreover,  the  degree  of  improvement  pos¬ 
sible  can  be  ascertained  by  a  comparison  of  and  €A  ,  and  €* ,  €r  and  €r .  The  quantities 
€.4  .  fcjf  .  and  €r  have  been  computed  and  tabulated  for  {(A  JC )7  6  A|  log  A  fc  {0,-1—  2}  axid 
log  K  €  {—2.— 3.— 4.— 5,— 6}}  (see  Tables  3.6  -  3.8).  A  discussion  of  the  results  follows. 

We  note  by  a  comparison  of  €A  and  €A  (Tables  3.1  and  3.6),  and  £x  (Tables  3.2,  3.7), 
and  €r  and  €r  (Tables  3.3,  3.8)  that  the  values  for  the  asymptotic  variances  and  asymptotic 
MSNRE  dictated  by  an  AE  estimator  are  significantly  lower  than  the  corresponding  values  for  the 
MM  estimator  for  all  parameter  pairs  under  consideration.  Moreover,  whereas  for  the  MM  esti¬ 
mator.  the  primary  contribution  to  €r  is  from  ,  for  the  AE  estimator,  neither  €A  nor  dom¬ 
inates. 

Consider  the  asymptotic  MSNRE.  Since  €r  ^  €* .  the  moments  estimator  is  not  only 
inefficient  in  the  sense  described  in  Section  3.2.3,  but  it  is  not  asymptotically  efficient.  Moreover, 
the  apparent  improvement  in  performance  yielded  by  an  AE  estimator  is  tremendous,  particularly 
for  values  of  K  <  <  A  ,  i.e.,  in  the  region  where  the  MM  estimator  is  most  inefficient.  Note  also 


Table  3.6.  ASYMPTOTIC  RELATIVE  VARIANCE  DUE  TO 
ESTIMATING  A  VIA  AN  AE  ESTIMATOR  (€A  ) 


K 

icr2 

10"3 

10"4 

1(T5 

10-6 

A 

m 

6.2440X101 

5.1972X101 

5.0865X101 

5.0735X101 

5.0720X101 

10‘1 

6.7196 

5.7920 

5.6860 

5.6732 

5.6716 

1 

1.5909 

1.3509 

1.3224 

1.3188 

1.3184 

10~2 


10-* 


10“* 


6.2945X101 

5.2961X101 

5.1869X101 

5.1744X101 

5.1730X101 

7.5054 

6.8492 

6.7839 

6.7773 

6.7767 

4.8083 

4.2365 

4.0854 

4.0470 

4.0385 

10~2 

10~3 

10“« 

10~5 

io-6 

1.2539X102 

1.0493X102 

1.0273X102 

1.0248X102 

1.0245X102 

1.4225X101 

1.2641X101 

1.2470X101 

1.2451X101 

1.2448X101 

6.3991 

5.5874 

5.4078 

5.3658 

5.3569 
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that  the  maximum  and  minimum  values  for  €r  are  1.0850  x  1015  and  3.2167  x  103,  respectively,  as 
opposed  to  a  maximum  and  minimum  value  of  1.25  39  x  102  and  5.3569.  respectively,  for  6r. 
Furthermore,  €r  achieves  its  maximum  value  at  the  same  point  at  which  €r  achieves  its  minimum 
value,  namely  at  A  =  1  ,  K  =  10-6.  In  fact,  the  improvement  at  this  point  is  on  the  order  of  1015. 
Thus,  there  is  a  dramatic  improvement  in  performance  in  the  area  where  it  is  most  needed. 

3.3.2.  Likelihood-based  estimators 

The  question  posed  at  the  beginning  of  the  section  has  now  been  answered.  In  particular,  the 
high  insensitivity  of  the  MM  estimator  to  changes  in  the  parameter  if  is  a  feature  of  the  estimator. 
It  is  not  a  feature  of  the  model.  Moreover,  we  can  expect  a  significant  improvement  in  performance 
given  that  an  asymptotically  efficient  estimator  can  be  found.  Naturally,  the  search  for  such  an 
estimator  begins  with  maximum  likelihood. 

Unfortunately,  maximum-likelihood  estimation  for  the  Class  A  model  turns  out  to  be 
unwieldy.  Numerical  experimentation  with  the  likelihood  equation  (LE)  reveals  that  the  LE  can¬ 
not  be  readily  solved  for  the  maximum-likelihood  estimator.  In  particular,  the  likelihood  function 
has  steep  gradients,  the  LE  has  multiple  roots  for  finite  sample  sizes,  etc.  Moreover,  closer  exami¬ 
nation  of  the  LE  in  the  context  of  estimating  a  single  parameter  (fixing  K  and  considering  the  prob¬ 
lem  of  estimating  A  only)  reveals  that  the  LE  does  not  have  a  unique  root  asymptotically  for  all 
(A  JC )r  €  A.  For  example,  for  (A  JC}  =  (lO^.lO-4),  the  LE  has  roots  at  A  =  10~2  and  at 
A  =  0.308312  asymptotically.  This  multiplicity  of  roots  implies  the  existence  of  inconsistent 
sequences  of  roots  to  the  LE  when  the  problem  of  estimating  A  only  with  K  known  is  considered 
and  a  similar  phenomenon  may  be  the  source  of  difficulty  in  the  two-parameter  situation. 

Thus,  we  have  a  consistent  estimator  (the  MM  estimator)  which  is  highly  inefficient  and  we 
have  a  potentially  efficient  estimator  which  is  computationally  difficult  (and  possibly  inconsistent). 
However,  the  consistency  of  the  moments  estimator  can  be  combined  with  the  potential  efficiency 
of  likelihood-based  estimation  to  yield  a  consistent  and  asymptotically  efficient  estimator  of  the 
parameters  of  the  Class  A  model.  This  is  done  via  a  standard  procedure  [14]  whereby  Newton's 
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root-finding  method  on  the  LE  is  initiated  with  a  Vn"  -consistent  estimator  and,  in  so  doing,  both 
the  consistency  of  the  consistent  estimator  and  the  efficiency  of  likelihood-based  estimation  are 
retained.  A  natural  initial  estimator  to  use  in  such  a  scheme  for  the  Class  A  estimation  problem  is 
the  MM  estimator,  which  was  shown  to  be  consistent  in  Section  3.2.2.  That  the  MM  estimator  is. 
in  fact.  •Jn  -consistent  follows  from  its  asymptotic  normality.  Thus,  provided  certain  regularity 
conditions  are  satisfied  by  the  densities  w  (see  [14]).  Newton  iteration  on  the  LE  initiated  with  the 
MM  estimate  will  be  consistent  and  efficient. 

As  a  practical  matter,  this  type  of  estimator  will  work  only  if  the  initial  estimate  (the  MM 
estimate)  is  reasonably  close  to  the  consistent  root  of  the  LE.  Unfortunately,  simulation  studies 
using  several  thousand  samples  indicate  that  this  closeness  is  not  achieved  for  most  values  of  A 
and  K  of  interest.  In  fact,  the  MM  estimator  frequently  produces  invalid  (e.g.,  negative)  initial 
estimates  for  these  sample  sizes.  Thus,  even  though,  this  procedure  performs  well  asymptotically, 
there  are  limitations  as  to  how  well  it  can  perform  for  a  moderate  number  of  samples,  the  source  of 
this  poor  performance  being  the  high  inefficiency  of  the  moments  estimator. 

However,  as  we  shall  see  in  the  following  section,  by  starting  with  an  initial  estimator  with 
enhanced  efficiency  and  iterating  around  it  until  Newton's  method  converges,  we  can  obtain  a  better 
estimator.  Hence,  we  turn  now  to  an  estimator  which,  despite  the  fact  that  it  lacks  the  asymptotic 
optimality  properties  of  the  Moment  /Likelihood,  procedure  described  above,  appears  to  be  much 
more  efficient  than  this  procedure  for  moderate  sample  sizes. 

3.4.  Threshold-Comparison/Likelihood  Estimator 

In  this  section,  we  consider  a  practical  estimator  based  on  the  idea  of  using  likelihood  iteration 
initiated  with  a  physically  motivated,  but  nonoptimal,  estimator.  In  particular,  we  consider  a  pro¬ 
cedure  which  uses  as  its  initial  estimator  a  scheme  motivated  by  Middleton's  approximate  empirical 
procedure.  This  latter  estimator,  described  in  [lO],[ll],  is  a  graphical  procedure  based  on  features 
of  the  finite  sample  size  distribution.  The  threshold  comparison  estimates  are  m^t:,,ated  in  the  fol¬ 
lowing  way:  Note  that  we  can  decompose  the  Class  A  envelope  pdf  w  (2.3)  into  two  components. 
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The  first  of  these  corresponds  to  the  m  =  0  term  and  is  attributable  to  the  Gaussian  background 
noise  component;  the  second  corresponds  to  all  terms  indexed  by  m  ^  1  and  is  attributable  pri¬ 
marily  to  the  impulsive  noise  component.  As  it  happens,  these  two  terms  can  be  clearly  dis¬ 
tinguished  for  most  A  and  K  values  of  practical  interest  if  we  consider  the  envelope  distribution 
function  P(Z  >  za  ).  Typical  envelope  distributions  are  shown  in  Figs.  3.1  and  3.2.  We  note 
from  the  form  of  this  function  that  it  divides  the  za  -axis  into  three  regions:  the  first  corresponds  to 
smaller  values  of  z0  ,  in  which  the  Gaussian  background  noise  component  dominates;  the  second 
corresponds  to  larger  values  of  z0  ,  wherein  the  impulsive  noise  component  dominates:  and  the  third 
corresponds  to  intermediate  values  of  z0  .  for  which  P  (  Z  >  z0  )  is  virtually  constant.  The  portion 
of  the  distribution  corresponding  to  these  intermediate  values  of  z0  will  be  termed  the  "null 
region."  We  note  that  for  values  of  A  ^  1CT1  and  T  ^  10-3,  the  departure  from  the  straight-line, 
Gaussian  (actually.  Rayleigh,  since  the  envelope  distribution  is  being  considered)  portion  of  the  dis¬ 
tribution  is  abrupt  and  extensive.  In  this  case,  the  null  region  is  clearly  identifiable.  Thus,  we  can 
set  a  threshold  a*  at  any  value  of  the  abscissa  (z0  )  corresponding  to  a  point  in  this  null  region  (see 
Fig.  3.1)  so  that,  with  high  probability,  samples  falling  below  a*  can  be  attributed  to  the  Gaussian 
background  component  and  samples  exceeding  at*  can  be  attributed  to  the  impulsive  component. 
For  values  of  A  <  1 0-1  or  F  >  10-3.  the  departure  from  the  straight-line,  Rayleigh  portion  is 
gradual  and/or  less  extensive,  in  which  case  such  an  or*  can  be  chosen  to  be  the  abscissa  value  (z0  ) 
at  which  the  distribution  begins  to  depart,  observably,  from  its  straight-line  (Rayleigh)  behavior 
(see  Fig.  3.2). 

The  above  feature  of  the  envelope  distribution  can  be  used  to  obtain  estimates  of  the  parame¬ 
ters  A  and  K .  In  particular,  from  an  i.i.d.  sequence,  Z  1# ... ,  Z„  ,  of  Class  A  envelope  samples,  we 

can  determine  an  estimate  of  the  threshold  a*  from  the  sample  distribution  function.  We  then 

divide  the  observations  Z  ! . Z„  into  a  set  {Z  ^ .  Z  2 . Zn[ }  consisting  of  those  lying  above  or* 

and  a  set  {  Z.i  .  Z  2 . Z„a}  consisting  of  those  falling  below  a*  .  Since  A  is  approximately  the 

expected  fraction  of  impulses  in  a  random  sample,  it  can  be  estimated  as 


Iq  (  dB  above  rms  value  ) 
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P(Z>20) 


Fig.  3.2.  Typical  envelope  distribution  for  A  <  10_1  and  T  >  10-3.  Note  the 
gradual  and  brief  departure  from  the  straight-line  (Rayleigh)  portion  of 
the  curve.  (The  distribution  is  plotted  on  linear  (for  z0)  by 
0.5  log10  (-logeP)  coordinates.) 
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An  =nr/n  .  (3.19) 

Similarly,  since  K  is  approximately  the  average  energy  in  a  background  sample  relative  to  that  in 
an  impulsive  sample,  it  can  be  estimated  as 

i  na  I  1  ni 

Kn  =  —  LZ?  I  —  IZi2  •  (3.20) 

nB  i=l  nI  i=l 

These  estimates  will  be  referred  to  as  the  threshold  comparison  estimates  of  A  and  if.  Note  that 
for  values  of  A  ^  10-1  and  T  ^  10-3.  An  is  simply  the  probability  value  corresponding  to  the 
point  where  the  sharp  rise  in  the  sample  distribution  begins,  and  for  values  of  A  <  10'1  or 
T  >  10~3.  A  „  is  simply  the  value  of  P  where  the  sample  distribution  begins  to  depart,  observably, 
from  its  straight-line  (Rayleigh)  behavior.  (The  threshold  comparison  estimator  is  based  on 
Middleton's  approximate  empirical  procedure,  which  is  described  in  [lO],[ll].  We  find  that  the 
threshold  comparison  estimator  is  computationally  easier  to  implement  and  leads  to  more  accurate 
results  than  the  approximate  empirical  procedure.) 

In  order  to  assess  the  performance  of  the  above  estimator,  an  extensive  simulation  study  was 
performed  wherein  the  normalized  sample  MSNRE  (4  n  (sample  MSNRE))  for  the  threshold  com¬ 
parison  estimator  was  computed  for  values  of  A  and  K  throughout  their  practical  ranges.  For 
each  parameter  pair,  the  computation  was  made  using  100  data  sets,  each  containing  3000  observa¬ 
tions  randomly  generated  from  the  corresponding  Class  A  envelope  pdf.  The  values  for  the  nor¬ 
malized  MSNRE  are  tabulated  in  Table  3.9.  Note  that  the  threshold  comparison  estimator  performs 
very  well.  In  particular,  from  a  comparison  of  Tables  3.3  and  3.9,  one  can  infer  that  the  threshold 
comparison  estimator  performs  significantly  better  than  the  MM  estimator  for  moderate  sample 
sizes.  Thus,  the  threshold  comparison  estimator  is  a  good  candidate  for  use  as  an  initial  estimator 
in  a  scheme  whereby  a  solution  to  the  likelihood  equation  is  determined  iteratively.  The  iterative 
determination  of  a  solution  to  the  LE  after  initiating  with  the  threshold  comparison  estimator  will 
be  referred  to  as  the  Threshold  -  Comparison  / Likelihood  Estimator.  Let  us  examine  the  perform¬ 


ance  of  this  estimator. 
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Table  3.9.  NORMALIZED  MSNRE  FOR  THRESHOLD  COMPARISON 
ESTIMATOR  (3000  SAMPLES.  100  RUNS) 


K 

A 

10~2 

fl 

1 

o 

T 

o 

r-i 

10-5 

10~* 

io-2 

2.3885X102 

2.1233X102 

2.1253X102 

2.1332X102 

2.1310X102 

m 

1.0812X102 

3.7563X101 

3.3097X101 

3.2785X101 

3.2810X101 

i 

7.7198X102 

7.4617X102 

8.1020X102 

8.1468X102 

8.1504X102 

Table  3.10.  NORMALIZED  MSNRE  FOR  THRESHOLD-COMPARISON/LIKELIHOOD 

ESTIMATOR  (3000  SAMPLES.  100  RUNS) 


K 

10-2 

10~3 

T 

o 

t-H 

o 

1 

10"6 

A 

IO'2 

1.1841X102 

9-2304X101 

8.7421X101 

8.8302X101 

8.8320X101 

m 

1.2818X101 

1.1732X101 

1.1723X101 

1.1747X101 

1.1736X101 

1 

6.5348 

5.1473 

5.3332 

5.2676 

5.2281 

As  before,  a  simulation  study  was  performed  wherein  the  normalized  MSNRE  for  the 
Threshold-Comparison/Likelihood  estimator  was  computed  for  a  range  of  A  and  K  values. 
Again,  the  computation  for  each  parameter  pair  was  made  using  100  data  sets,  each  containing  3000 
observations.  The  results  are  tabulated  in  Table  3.10.  We  noted  above  that  the  threshold  com¬ 
parison  estimator  performs  very  well.  However,  a  comparison  of  Tables  3.9  and  3.10  reveals  that 
the  Threshold-Comparison/Likelihood  estimator  performs  even  better.  In  particular,  for  A  —  1, 
there  is  a  reduction  in  the  normalized  MSNRE  on  the  order  of  102  for  all  values  of  K  under  con¬ 
sideration.  Moreover,  a  comparison  of  Tables  3.8  and  3.10  indicates  that  the  normalized  MSNRE 
for  the  Threshold-Comparison/Likelihood  estimator  is  very  close  to  the  Cramer-Rao  Lower  Bound 
for  all  values  of  A  and  K  under  consideration. 
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In  addition  to  computation  of  the  normalized  MSNRE,  the  relative  biases  for  the  threshold 
comparison  and  Threshold-Comparison/Likelihood  estimator  were  computed  (3000  samples.  100 
runs)  (see  Tables  3.11.  3.12).  Note  that,  in  addition  to  lowering  the  MSNRE,  the  likelihood  itera¬ 
tion  step  also  serves  as  a  bias  reduction  technique  for  most  values  of  A  and  K .  In  particular,  a 
substantial  reduction  in  the  magnitude  of  the  relative  bias  is  observed  for  A  =  1  and  A  =  10_1  for 
all  values  of  K  under  consideration. 


Table  3.11.  RELATIVE  BIAS  FOR  THRESHOLD  COMP  ARISON 
ESTIMATOR  (3000  SAMPLES.  100  RUNS) 


K 

IQ"2 

10-3 

A 

IQ"2 

-2.0086X10-1 

-3.4793X10- 

2 

•-H 

1 

o 

*— 4 

— 2.3777X10-1 

-1.0400X10- 

1 

1 

— 7.0614X10-1 

-6.9930X10- 

1 

-1.2456X10"2 


Table  3.12.  RELATIVE  BIAS  FOR  THRESHOLD-COMPARISON/LIKELIHOOD 

ESTIMATOR  (3000  SAMPLES,  100  RUNS) 


-3.9698X10-2  -4.3362X10-2 


7.7754x10" 


1.1470X10-3 


8.0989X10-3 


1.4592X10-3  1.1912X10"3 


In  summary  of  the  above,  the  Threshold-Comparison/Likelihood  estimator  has  many  desir¬ 
able  features:  (i)  it  performs  very  well  from  a  practical  viewpoint  (substantially  better  than 
MM -based  estimators):  (ii)  the  normalized  MSNRE  for  the  estimator  is  very  close  to  the  CRLB; 
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and  Ciii)  it  serves  as  a  bias  reduction  technique.  Unfortunately,  the  Threshold- 
Ccmparison/Likelihood  estimator  apparently  lacks  the  asymptotic  optimality  properties  of  the 
Moment/Likelihood  procedure.  In  fact,  it  appears  from  examination  of  the  properties  of  the  popu¬ 
lation  distribution  function  that  this  estimator  is  asymptotically  biased  and  inconsistent.  However, 
as  we  have  seen,  it  works  quite  well  for  moderate  sample  sizes,  and  thus  is  an  attractive  estimator 
for  use  in  applications. 

3-5.  Conclusions 

In  this  chapter,  we  have  proposed  and  investigated  several  batch  estimators  for  the  parameters 
of  the  Middleton  Class  A  noise  model  in  its  strictly  canonical  form.  These  estimators  include  the 
method-of-moments  estimator,  which  is  computationally  attractive  but  is  unattractive  in  terms  of 
performance;  likelihood-based  estimators,  which  are  potentially  efficient  but  which  have  undesir¬ 
able  computational  properties;  and  the  Moment/Likelihood  estimator  which,  in  its  asymptotic 
performance,  combines  the  desirable  features  of  these  two  approaches.  In  response  to  the  poor 
moderate-sample-size  performance  of  the  Moment/Likelihood  estimator  observed  via  simulation,  a 
similar  estimator  that  initiates  likelihood  iteration  with  the  threshold  comparison  estimator  has 
also  been  considered.  Analysis  of  the  moderate-sample-size  performance  of  this  scheme  shows  it  to 
be  an  effective  estimator  for  practical  use  (i.e.,  sample  sizes  on  the  order  of  103). 
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4.  RECURSIVE  IDENITTF CA  TION 

4.1.  Introduction 

In  Chapter  3,  we  addressed  the  problem  of  basic  batch  estimation  of  the  Class  A  parameters. 
In  this  chapter,  we  will  develop  a  recursive  algorithm  for  on-line  identification  of  these  parameters. 
In  particular,  our  objective  is  to  provide  a  global  recursive  estimator  of  the  parameters  of  the  Class 
A  model  which  performs  well  for  all  parameter  vectors  in  the  parameter  set  of  interest.  We  begin 
by  proposing  a  basic,  physically-motivated,  decision-directed  algorithm.  This  decision-directed 
scheme  is  based  on  an  adaptive  Bayesian  classification  of  each  Class  A  envelope  sample  as  being 
either  impulsive  or  background.  As  each  sample  is  so  classified,  recursive  updates  of  the  estimates 
of  the  following  three  quantities  are  obtained:  the  second  moment  of  the  impulsive  component  of 
the  interference  envelope  density,  the  second  moment  of  the  background  component  of  the  interfer¬ 
ence  envelope  density,  and  the  probability  with  which  the  impulsive  component  occurs.  From  these 
estimates,  estimates  of  the  parameters  of  the  model  are  readily  obtained,  since  closed-form  expres¬ 
sions  for  the  parameters  exist  in  terms  of  these  three  quantities.  Examination  of  the  performance 
of  the  proposed  estimator  via  simulation  reveals  two  major  shortcomings  of  the  scheme,  which 
adversely  affect  its  performance  even  in  a  local  setting.  However,  by  appropriately  modifying  the 
basic  algorithm  and  imposing  the  necessary  restrictions  on  the  form  of  one  of  its  initiation  vectors, 
a  global  recursive  estimator  of  the  parameters  of  the  Class  A  model  with  excellent  performance 
characteristics  can  be  obtained. 

The  chapter  is  organized  as  follows.  In  Section  4.2,  the  basic  decision-directed  algorithm  is 
developed  and  its  performance  examined.  Via  a  probability-of-error  analysis,  it  is  seen  that  the 
degradation  in  the  performance  of  the  algorithm  arising  from  its  two  basic  shortcomings  can  be 
alleviated  if  the  necessary  restrictions  are  imposed  and  the  appropriate  modifications  are  incor¬ 
porated.  Consequently,  in  Section  4.3.  an  initiation  procedure  for  each  parameter  which  yields  ini¬ 
tial  estimates  of  that  parameter  satisfying  the  necessary  restrictions  is  presented.  Upon  using  the 
estimates  obtained  from  these  procedures  as  initial  estimates  of  the  parameters,  a  modified  version 
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of  the  basic  algorithm  is  then  proposed  which  incorporates  the  modifications  suggested  by  the  two 
flaws  of  the  algorithm  and  some  additional  modifications  which  are  deemed  necessary  for  improv¬ 
ing  its  performance  in  a  global  framework.  In  Section  4.4.  the  moderate-sample-size  performance 
of  the  modified  algorithm  is  explored  extensively  via  simulation.  From  these  simulations,  it  is  seen 
that  the  proposed  global  decision-directed  scheme  does,  in  fact,  provide  a  global  estimator  of  the 
parameters  that  performs  very  well  for  all  parameter  vectors  in  the  parameter  set  of  interest.  Some 
concluding  remarks  are  contained  in  Section  4.5. 

4.2.  A  Basic  Decision-Directed  (BDD)  Algorithm 

The  problem  of  recursive  estimation  of  the  Class  A  parameters  from  an  independent  sequence 
of  Class  A  envelope  samples  will  now  be  considered.  The  procedure  which  will  be  proposed  in  this 
section  is  a  recursive  version  of  the  Threshold-Comparison  estimator,  which  was  seen  in  Section  3.4 
to  provide  good  estimates  of  the  parameters.  The  objective  in  the  batch  scheme  and  its  recursive 
version  is  optimally  to  discriminate  between  background  and  impulsive  samples,  the  optimality 
criterion  being  minimization  of  the  probability  of  an  incorrect  classification.  Given  that  the  sam¬ 
ples  can  be  so  classified,  accurate  estimates  of  the  parameters  can  then  be  obtained.  Of  course,  the 
optimum  decision  statistic  in  this  case  is  given  by  the  likelihood  ratio  test  (LRT).  Fortunately,  as 
will  be  seen  later,  for  each  parameter  vector  in  the  parameter  set  of  interest,  the  likelihood  ratio 
CLR)  is  strictly  monotone  increasing  in  the  envelope  sample.  Thus,  to  each  parameter  vector  in  the 
parameter  set  of  interest,  we  can  associate  a  unique  threshold  so  that,  for  a  given  observation,  the 
LRT  is  equivalent  to  comparing  that  observation  to  this  optimum  threshold.  The  problem  of 
optimum  discrimination  of  the  samples  has  then  been  transformed  to  the  problem  of  locating  the 
optimum  threshold  corresponding  to  the  true  parameter  vector.  This  is  what  the  proposed  algo¬ 
rithm  attempts  to  do. 

4.2.1.  Formulation  of  the  algorithm 

The  proposed  recursive  scheme  is  a  basic  decision-directed  (BDD)  algorithm  based  on  an  adap¬ 
tive  Bayesian  classification  of  each  of  a  sequence  of  independent  Class  A  envelope  samples  as 
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background  or  impulsive.  The  mathematical  formulation  for  the  algorithm  is  given  as  follows: 
Let  denote  the  unnormalized  Class  A  envelope  pdf,  i.e.. 


w 


tm 


<T2  >  0. 


2 

The  parameter  cr  ,  the  second  moment  of  the  interference  envelope,  was  assumed  to  be  unity  in  the 
previously  considered  batch  estimation  problem,  since,  in  a  batch  setting,  the  data  can  be  easily 
normalized  to  have  unit  second  moment.  As  stated  in  Section  3.4,  we  can  decompose  the  Class  A 
envelope  pdf  into  two  components.  The  first  of  these  corresponds  to  the  m  =  0  term  and  the 
second  corresponds  to  all  terms  indexed  by  m  ^  1.  i.e.. 


W,m  (Z  )  =  e 


-A 


2  z 


2  2 
a  cr0 


+  (l-e-A ) 


2  e 


-A 


z  e 


m  =1  m!trm 


O'2  (1  —  e  A) 


z  ^  0  . 
(4.1) 


=  (1  -tr^poCz)  +  Trlpl(z) 


where 


and 


tt1  4  1  —  e 


-A 


,  ,  A  2Z  ^ 

a 


,2  2 


cr  o-, 


P^z)  4 


2e~A  Z  Am 


-:VV 


0-2  m  =1  ni!<r* 


■  z  e 


(4.2a) 


(4.2b) 


(4.2c) 


In  the  sequel,  p0  will  be  referred  to  as  "the  background  component  of  *  and  p  j  as  "the  impulsive 
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component  of  "1  Thus,  Class  A  envelope  samples  attributable  to  p  0  will  be  referred  to  as  "back¬ 
ground  samples"  and  those  attributable  to  p  l  will  be  referred  to  as  "impulsive  samples."  Let  CTg 
denote  the  second  moment  of  w ^  conditioned  on  the  event  that  p0  occurred,  and  let  (Tj  denote  the 
second  moment  of  conditioned  on  the  event  that  p  t  occurred,  i.e., 

CO  oo 

CTg  =  z2p0(z)dz  and  c r]  -  z2pl(z)dz  .  (4.3) 

From  (4.2a)-(4.2c)  and  (4.3),  we  then  have  that 

v1  =  1  —  e~A  .  (4.4a) 


K 

A  +  K 


(4.4b) 


and 


A  +vlK 
irx(A  +K ) 


(4.4c) 


These  equations  can  be  readily  inverted  to  yield  unique,  closed-form  expressions  for  the  parameters 

2  2 

of  the  Class  A  model  in  terms  of  irv  <r3 ,  and  <rl .  Specifically, 


A  =  —  In  (1  —7^)  , 


(4.5a) 


K  = 


—  In  (1—7 Tj) 


<7 


2 

B 


7rl°'/ 


—  TTjCT 


2 

B 


(4.5  b) 


and 

<72  =  (1  —  TTj)  O-j  +  7TjC r/  (4.5c) 

for  all  (A  ,  K .  c r2)  €  A  ,  where  A  4  {(A  ,  K ,  o’2) :  (A  ,  K  )r  €  A  and  cr2  >  0}. 


*That  p  „  can  juatiflably  b«  refened  to  aa  the  "background  component  of  v"  follows  from  the  discussion  given  in 
Chapter  2.  Furthermore,  since  p  j  is  primarily  attributable  to  the  impulsive  component  of  the  input  noise  and,  since  the  vari¬ 
ance  of  p  j  is  significantly  larger  than  that  of  p  „  for  parameter  vectors  in  the  parameter  set  of  interest,  p  j  can  also  justifiably 
be  referred  to  as  "the  impulsive  component  of  v  Again,  this  terminology  is  consistent  with  that  given,  e.g.,  in  [16], 
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Given  that  we  can  distinguish  between  envelope  samples  attributable  to  the  background  com¬ 
ponent  of  and  those  corresponding  to  the  impulsive  component,  then,  based  on  the  definitions 
2  2 

of  crfl,  <Tj ,  and  7r1,  we  can  use  as  estimates  of  these  quantities  the  sample  second  moment  of  those 
samples  classified  as  background,  the  sample  second  moment  of  those  samples  classified  as  impul¬ 
sive.  and  the  frequency  with  which  the  impulsive  samples  occur,  respectively.  From  these  esti¬ 
mates.  we  can  then  obtain  estimates  of  A  ,  K ,  and  cr2  using  the  relations  given  in  (4.5a)-(4.5c). 
Since  the  classification  of  each  sample  as  "impulsive"  or  "background"  can  be  performed  using  a 
likelihood  ratio  test  based  on  that  sample  and  on  the  estimate  of  (A  ,  K .  or2)  obtained  at  the  previ¬ 
ous  iteration  of  the  algorithm,  we  are  now  in  a  position  to  propose  the  following  decision-directed 
recursive  scheme  for  estimating  the  parameters  of  the  Class  A  model,  wherein  an  adaptive  Bayesian 
classification  of  each  sample  as  impulsive  or  background  is  performed  : 

Basic  Decision-Directed  (BDD)  Algorithm 

Step  1  :  Choose  the  initiation  vectors.  Choose  (tt^CO).  <xj( 0).  -n^cr/CO))  arbitrarily  and 

(A 0,  K0.  6r0)  €  A  .  (A  tilde  above  a  given  quantity  denotes  the  estimate  of  that  quantity  for  the 

2  2 

iteration  shown  after  that  estimate  either  parenthetically,  when  estimates  of  rrv  crg,  or  j  are 

2 

being  considered,  or  as  a  subscript,  when  estimates  of  A  .  K ,  or  cr  are  being  considered.  Note  th^t 

2  2  2 

we  are  considering  estimates  of  tt^j  instead  of  cr;  since  <Jl  always  appears  in  conjunction  with  tt1 
in  the  expressions  for  the  parameters  given  in  (4.5a)-(4.5c).)  The  initiation  vector 
(7r1(0).  o-JCO).  tt jO 2(0))  is  chosen  arbitrarily  since,  as  will  be  seen  from  the  form  of  the  update 
equations  given  in  Step  3.  the  performance  of  the  algorithm  is  independent  of  this  choice.  At  the 
n-th  iteration  (n  ^  1),  we  have  the  estimate  vectors  (ir^n—  1),  crB(.n—  1),  7^07  (n—  1))  and 
(A„  _t,  Kn  _j,  &n  _x)  and  we  observe  the  n  -th  sample  Zn  . 

Step  2  :  Classify  Z„  as  an  impulsive  sample  or  as  a  background  sample  using  a  likelihood  ratio  test 
based  on  the  estimate  of  (A  ,  K ,  o2)  obtained  at  the  ( n  — l)-st  iteration. 
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Let 


= 


if  /  (Z„  :An_v  Kn_v  o-;1_1)  >  1 


0  if  f(Zn:An  _v  Kn  _v  <  1 


where  /  is  the  likelihood  ratio  function  normalized  so  that  the  threshold  of  the  LRT  has  unity 
value,  i.e.. 
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(4.6) 


and  where 

<f>n  =  1  =>  Zn  is  classified  as  an  impulsive  sample, 

<f>n  =  0  =>•  Zn  is  classified  as  a  background  sample. 

(The  functions  pQ  and  px  have  been  defined  in  (4.2b)  and  (4.2c).  respectively.  Here  the  dependence 
of  these  functions  on  the  parameters  is  made  explicit.)  In  the  sequel,  the  function  /  will  be 
referred  to  as  the  normalized  likelihood  ratio  (NLR)  function. 

Step  3  :  Update  recursively  the  estimates  of  wl#  erg,  and  tTjCT 2.  (These  three  parameters  will  be 
referred  to  as  the  update  parameters.)  :  For  n  ^  1,  let 


tt^  n  )  =  7T x(  n  —  1  )  +  —  (  —  rrx(,  n  —  1  )  ) 
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and 

( n  )  =  -  1  )  +  —  C <f>n  Z„2  -  Tr^ar/ C n  -  1  )  )  .  (4.7c) 

n 

Note  that  at  the  first  iteration  of  the  algorithm  (n  =  1),  execution  of  this  step  results  in  cancellation 
of  the  terms  involving  the  initial  estimate  of  ir1  in  (4.7a)  and  those  involving  the  initial  estimate  of 
TTiOrf  in  (4.7c).  Furthermore,  at  the  first  iteration  n  of  the  algorithm  for  which  a  given  sample  is 
classified  as  a  background  sample,  the  first  portion  of  (4.7b)  becomes  effective  and  execution  of  this 
step  results  in  cancellation  of  the  terms  involving  the  initial  estimate  of  crB  in  this  portion  of 
(4.7b).  Since,  as  will  be  seen  in  the  next  step  of  the  algorithm,  the  classificiation  of  each  sample  as 
background  or  impulsive  depends  only  on  the  value  of  the  initiation  vector  (A  0.  K  0,  cr0 )  when 
n  ^  n  .  it  follows  that  for  a  given  sample  sequence  the  values  of  •jf1(l).  <xfl(n').  and  7r1cr/  (l)  are 
unaffected  by  the  choice  of  Oir^O).  arB(0).  ir^j  (0)).  Moreover,  the  estimate  of  each  update 
parameter  obtained  at  a  given  iteration  depends  on  the  estimate  of  that  parameter  obtained  at  the 
previous  iteration  only.  Consequently,  for  a  given  sample  sequence,  initiation  vector  (A  0.  K0.  cr0), 
and  fixed  value  of  n  ^  n  .  the  estimate  vector  (w,1(n  ),  crs  (n  ).  7r1c r7  (n  ))  will  have  the  same  value 

independent  of  the  choice  of  the  initiation  vector  (w'1(0),  cB  (0),  7r1cr/  (0)).  Thus,  as  claimed  in 

Step  1.  the  performance  of  the  algorithm  is  unaffected  by  the  choice  of  the  initial  estimates  of 

2  2 
TT j,  CT^  ,  and  7T ^(7 ^  . 

Note,  in  addition,  that  the  estimates  of  the  update  parameters  given  by  recursions  (4.7a)- 
(4.7c)  axe  motivated  entirely  by  their  definitions:  For  n  ^  1,  7r x(n)  as  given  by  (4.7a)  is  simply 
the  proportion  of  samples  that  have  been  classified  as  impulsive  by  the  n-th  iteration:  for 
n  ^  n'.  crB(n)  is  the  sample  second  moment  of  those  samples  that  have  been  classified  as  back¬ 
ground  by  the  n-th  iteration:  and  lastly,  for  n  ^  1,  is  the  proportion  of  those  samples 

that  have  been  classified  as  impulsive  by  the  n  -th  iteration  times  the  sample  second  moment  of 
those  samples. 
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Step  4 :  Obtain  estimates  of  the  parameters  of  the  Class  A  model.  First,  let 

n  n 

Z L(i-<t>l') 

1=1  i=i 

and  let  n*  denote  the  minimum  value  of  n  for  which  an  is  nonzero.  Then,  for  n  <  n*  ,  set 
(A„  .Kn.&n)  =  (An  _x.  Kn  &n  _,).  For  n  £  n* ,  obtain  estimates  of  the  Class  A  parameters  in 

terms  of  irv  6rfl ,  and  using  the  expressions  for  these  parameters  given  in  (4.5a)-(4.5c): 


crn2  =  (1  —  rrjji  ))  a-g(n)  +  ttjt2(ji  )  .  (4.8c) 

(Note  that  (4.8a)-(4.8c)  do  not  yield  valid  estimates  of  the  parameters  for  n  <  n*  .) 

Step  5  :  Constrain  ( An  .  Kn  .  cr„  )  to  lie  in  the  parameter  set  of  interest.  First,  extend  the  boundary 
of  A  slightly  (by  a  factor  of  1.1)  in  two  of  its  coordinates  to  obtain  the  following  set  A*  contain¬ 
ing  A': 

A*  &  ( AJC  ,cr2):  9.09  x  10”3  <  A  **  1.1,  9.09  x  10-7  <J<1.1X  10'2,  cr2  >  0  . 


Secondly,  let 

P*  A  An-  9.09  X  10-3, 
f K 1  4  1.1  -An. 
p  3  A  Kn  -  9.09  X  10"7, 
p*  A  1.1  XI 0~2-xn. 

Thirdly,  modify  (An ,  Kn  ,  or2)  as  follows  to  obtain  the  constrained  (to  lie  in  A* )  estimate 
( An  ,  Kn  ,  cr  2)  of  (A  .  K .  cr2)  for  the  n  -th  iteration: 
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'l*  = 


(1  —  max  Ugn  /3^.  0})(9.09  X  10  3) 
+  (1  —  max  Ugn  (3  2,  0})(l.l) 


if  &X  >  0 


if  &n&n  <  0 


(4.9a) 


K.  = 


(1  —  max  {rgn  /3^,  0})(9.09  X  10  7) 
l  +  (l  —  max  {sgn  p*.  0})(1.1  X  10  2) 


if  PnPn  >  0 


if  fiX  <  0 


(4.9b) 


_ 2  ~  2 

cr  =  crn 

n  n 


(4.9c) 


_ .  _  _ 2  *  **»  ^  2 

Now,  if  (A,,  ,  ,  crn)  =  (An  ,Kn.a‘n),  then  execution  of  this  step  is  complete.  However,  if 
(An  ,  ,  an)  r*  (An  ,Kn  ,6rn).  then  proceed  as  follows:  Using  the  inverse  relations  given  in 
(4.4a)-(4.4c).  modify  •n'1(n  ),  <xB (n  ),  and  -n^O; (n  )  at  the  n  -th  iteration  to  reflect  the  above  changes 
in  the  estimates  of  the  Class  A  parameters.  This  modified  estimate  will  be  denoted  by 
(jr^n  ).  o=j(n  ).  Tr^f(n  )): 


7rx(n  )  =  1  —  e 


-a. 


(4.10a) 


and 


o-f(n)  =  or2 


An  +  Kn 


(4.10b) 


2  r  -\  —  2 

7r1cr/(n)  =  o-n 


An  +  irl(n')Kn 


A„  +  K„ 


(4.10c) 


«•  «,  2  » — 'in »  2  2 

Finally,  set  ( ir1(n )  ,  ors(n).  7^07(71))  =  (  n.  )  .  crfl(n).  77707(71))  and 


(An.Kn.&2)  =  (An,Kn.cr^. 


The  steps  of  the  basic  decision-directed  algorithm  are  now  complete.  From  a  graphical  stand¬ 
point,  this  is  what  the  BDD  algorithm  attempts  to  do:  Let  (A  .  K ,  cr2)  denote  a  parameter  vector  in 
A'  with  corresponding  envelope  pdf  vrm  as  shown  in  Fig.  4.1.  Now.  for  each  (A,K.  cr2)  €  A' , 


2  4 

Fig.  4.1.  Envelope  pdf  for  typical  parameter  vector  (  A  .  K  .  cr  )  in  A 
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/  ( z;A.K,c [2)  as  defined  in  (4.6)  is  a  strictly  increasing  continuous  function  of  z.  with 
lim  /  {z  ;  A  .  K  ,  cr2)  <  1  and  lim  /  (z  :  A  ,  K.  <72)  =  eo.  Thus,  for  each  (  A  ,  K  ,  cr2)  €  A*,  there 


1  -  o 


exists  a  unique  ;  6  (0.  oo)  for  which  /  ~ A  .  K ,  or2)  =  1.  It  then  follows  that  for 

each  (  A  .  K .  cr  )  €  A  ,  the  corresponding  likelihood  ratio  test  for  a  given  observation  is  equivalent 

to  comparing  the  given  observation  to  the  optimum  threshold  \  i.e.,  the  decision  regions 

which  minimize  the  probability  of  error  in  the  classification  process  for  each  (  A.K .  <r2)  6  A'  con¬ 
sist  of  the  intervals  (0,  .  (t0^ j‘-‘-  *  ,  oo).  (If  the  observation  lies  in  (0.  ^].  it  is 

classified  as  a  background  sample,  whereas  if  the  observation  lies  in  (r0(p~' ~  \  oo),  it  is  classified 
as  an  impulse.)  Consequently,  since  {A.K,  <r2)  lies  in  A  .  we  can  associate  with  the  parameter 
vector  {A.K.  cr2)  the  optimum  threshold  r0^  -K •<T  ^  in  the  manner  just  described.  Similarly,  since 
the  sequence  of  estimate  vectors  (An  .  Kn  .  <r2)  of  the  true  parameter  vector  {A.K.  cr2)  lies  in  A  . 
we  can  associate  with  this  sequence  of  estimate  vectors  a  corresponding  sequence  of  threshold  esti- 


( \n .  in .  vh 

mates  T  *  n  "  .  Implicitly,  via  this  sequence  of  threshold  estimates,  the  BDD  algorithm 
attempts  to  locate  T^t  -x  ■ar  ^  and.  hence,  the  corresponding  optimum  decision  regions 

co.t^-'Y(t£-^>  ,oo).  In  so  doing,  it  can  then,  with  minimum  error  probability,  discrim¬ 
inate  between  those  samples  corresponding  to  the  main  lobe  of  the  envelope  pdf  and  those 
corresponding  to  the  tail  of  the  pdf  (see  Fig.  4.1).  Given  that  the  background  samples  (those 
corresponding  to  the  main  lobe  of  the  envelope  pdf)  can  be  optimally  discriminated  from  the 
impulsive  samples  (those  corresponding  to  the  tail  of  the  pdf),  accurate  estimates  of  the  Class  A 
model  parameters  can  then  be  obtained. 

We  see  then  that  the  BDD  algorithm  is  physically  motivated,  easy  to  implement,  and  is  a 
recursive  version  of  a  batch  procedure  which  is  known  to  provide  good  estimates  of  the  parameters. 
In  the  next  section,  we  will  examine  the  performance  of  this  BDD  algorithm. 
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4.2.2.  Performance  of  BDD  algorithm 

The  behavior  of  the  BDD  algorithm  has  been  examined  for  an  extensive  range  of  true  parame¬ 
ter  vectors  (A  .  K .  <r2)  6  A'  and  an  equally  extensive  range  of  initiation  vectors  (A0,  K0.  erg)  6  A 
for  each  true  parameter  vector.  A  few  major  difficulties  have  been  observed,  these  difficulties  being 
equally  apparent  when  cr —  1  and  5’n  is  fixed  to  have  unity  value  for  all  n  ^  0.  Thus,  for  the 
sake  of  simplicity,  we  will  now  cite  these  difficulties  as  they  pertain  to  the  situation  when 
<T„  =  cr2  =  1  (n  ^  0).  We  note  that  cr2  is  taken  to  have  unity  value  since  the  absolute  value  of  cr2 
has  no  bearing  on  the  estimation  problem  at  hand  (see  Appendix  A). 

Drawbacks  of  BDD  Algorithm 

(0  For  values  of  A  close  to  10-2  and  arbitrary  values  of  K ,  the  convergence  of  the  BDD  algo¬ 
rithm  to  the  true  parameter  vector  is  sensitive  to  the  distribution  of  impulses  over  values  of 
n  <  O  (500).  That  is,  despite  the  fact  that  the  algorithm  may  correctly  distinguish  between 
impulsive  and  background  samples  in  its  initial  stages,  if  the  percentage  of  impulses  over  the 
sequence  of  samples  classified  correctly  exceeds  the  expected  percentage  (which,  e.g.,  would  be 
0.995%  for  A  =10~2  since,  for  A  =10~2.  tr1  =  1 —e  A  =  9.95  X  10  3),  then,  with  significant 
probability,  the  algorithm  will  not  converge  to  the  true  parameter  vector.  Furthermore,  for 
fixed  A  ,  the  frequency  with  which  the  algorithm  does  not  converge  to  the  true  parameter  vec¬ 
tor  due  to  its  sensitivity  to  the  distribution  of  impulses  increases  with  increasing  K ,  becoming 

_2 

relatively  high  for  values  of  K  close  to  10  . 

(ii)  For  values  of  F0  4  K0  /A0^  T,  the  frequency  with  which  the  algorithm  converges  to  the  true 
parameter  vector  is  relatively  high,  whereas  for  values  of  T0  <  T,  the  frequency  with  which 
the  algorithm  converges  to  the  wrong  parameter  vector  is  relatively  high.  The  former  portion 
of  this  statement  is  not  valid  for  values  of  A  close  to  10-2,  since,  as  explained  in  (i),  the  dis¬ 
tribution  of  impulses  forces  the  algorithm  to  converge  to  the  wrong  parameter  vector  for  a 

—2 

significant  percentage  of  the  runs  when  A  is  close  to  10  .  Moreover,  for  values  of  A  close  to 
1.  the  frequency  with  which  the  algorithm  converges  to  the  wrong  parameter  vector  once 
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again  becomes  relatively  high  for  values  of  f0  —  0  (  ^  10  *).  Thus,  even  when  the  set  of 
vectors  from  which  the  initiation  vectors  are  chosen  is  a  small  neighborhood  of  the  true 
parameter  vector,  the  performance  of  the  BDD  algorithm  can  vary  drastically  over  that  neigh¬ 
borhood  depending  on  the  location  of  the  initiation  vector.  This  highly  nonuniform  behavior 
of  the  algorithm  even  when  the  initiation  vectors  lie  close  to  the  true  parameter  vector  makes 
it  undesirable  for  use  even  as  a  local  tracking  scheme. 

A  careful  and  thorough  analysis  of  the  source  of  these  two  shortcomings  of  the  BDD  algo¬ 
rithm  has  been  made,  and  is  given  in  Appendix  A.  We  note  here  that  the  ensuing  adverse  effects  on 
the  performance  of  the  algorithm  arising  from  its  two  basic  drawbacks  can  be  eliminated  by  plac¬ 
ing  certain  restrictions  on  the  form  of  its  initiation  vector  and  by  incorporating  the  appropriate 
modifications  into  its  framework.  In  particular,  the  following  restrictions  Con  (A  0,  K  Q,  cr0  ))  and 
modifications  must  be  imposed  (the  derivation  of  these  conditions  can  be  found  in  Appendix  A): 

(Rl)  A0  must  either  provide  an  accurate  estimate  of  A  ,  or,  A0  must  provide  an  estimate  of  A 
for  which  A  0<  A  and  not  less  by  an  order  of  magnitude  or  more. 

K0  must  either  provide  an  accurate  estimate  of  K ,  or,  K 0  must  provide  an  estimate  of  K 
for  which  K 0>K  and  not  greater  by  two  orders  of  magnitude  or  more. 

*•2  2 
cr0  must  provide  an  accurate  estimate  of  cr  . 

The  estimator  of  cr2  given  by  (4.8c)  must  be  replaced  by  an  estimator  of  <J2  consisting  of  an 
update  equation  for  <r02. 

(M2)  The  estimate  of  A  must  be  fixed  to  its  initial  value  A  0  in  the  initial  stages  of  the  algorithm. 

2 

with  only  the  estimates  of  K  and  O'  being  updated. 

In  the  next  section,  we  will  consider  a  modified  BDD  algorithm  which  incorporates 
modifications  (Ml)  and  (M2)  and  whose  initiation  vector  satisfies  conditions  (R1)-(R3). 


(R2) 

(R3) 

(Ml) 
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4.3.  A  Global  Decision-Directed  Algorithm 

In  this  section,  we  will  develop  a  global  recursive  estimator  of  the  Class  A  parameters  by 
appropriately  modifying  the  BDD  algorithm.  First,  we  will  present  an  initiation  procedure  for  each 
parameter  which  yields  initial  estimates  of  the  parameter  satisfying  the  corresponding  restriction  as 
given  in  (Rl)-(R3)  of  the  previous  section.  Then,  we  will  propose  a  modified  BDD  algorithm  which 
incorporates  the  changes  described  under  (Ml)  and  (M2)  and  additional  changes  which  are  deemed 
necessary  either  for  the  sake  of  simplifying  the  BDD  algorithm  at  a  given  step  or  for  the  sake  of 
improving  its  performance  in  a  global  framework. 

Initiation  Procedure  for  A 

We  need  to  locate  an  initiation  procedure  which  yields  estimates  A0  satisfying  restriction 
(Rl).  The  search  for  such  an  estimator  can  be  decomposed  into  two  steps:  (i)  First,  we  will  locate 
a  procedure  which  provides  us  with  a  reasonable  estimate  of  A  .  (ii)  Secondly,  we  will  construct  a 
quantizer  whose  input  is  this  estimate  and  whose  output  is  the  estimate  quantized  in  the  direction 
of  small  A  .  This  quantized  estimate  will  then  be  used  for  A  0  .  By  decomposing  the  search  in  this 
manner,  the  determination  of  a  procedure  which  yields  initial  estimates  of  A  with  the  desired 
property  is  greatly  simplified. 

Let  us  focus  our  attention  on  the  first  step:  Which  estimator  will  provide  us  with  a  reason¬ 
able  estimate  of  A  ?  One  procedure  which  suggests  itself  from  the  batch  estimation  problem  dis¬ 
cussed  in  the  previous  chapter  is  the  method  of  moments.  It  was  seen  in  Section  3.2  that  the  MM 
estimator  based  on  the  fourth  and  sixth  moments  was  highly  inefficient  in  estimating  the  parame¬ 
ters  of  the  model.  However,  this  high  inefficiency  was  due  to  the  insensitivity  of  the  moments  to 
changes  in  the  parameter  K .  In  fact,  the  computed  asymptotic  variances  for  the  normalized  MM 

estimate  of  A  (given  in  Table  3.1)  suggest  that  a  relative  error  in  the  estimate  of  A  on  the  O  (10  *) 

7. 

can  be  attained  using  the  MM  estimator  if  a  samole  size  of  10000  is  used.  Thus,  given  the  many 

2  The  computed  variances  were  based  on  the  assumption  of  a  fixed  envelope  second  moment.  However,  they  should  not 
change  significantly  when  the  second  moment  is  unknown  since,  as  will  be  seen  in  the  sequel,  the  sample  second  moment  is 
extremely  accurate  for  10000  samples. 
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desirable  features  associated  with  the  MM  estimator,  among  which  are  its  recursivity  and  compu¬ 
tational  expediency,  we  will  use  the  MM  estimator  based  on  10000  samples  to  obtain  a  reasonable 
estimate  of  the  parameter  A  . 

In  some  cases,  the  estimates  of  A  obtained  via  the  method  of  moments  may  exceed  the  true  A 
by  a  significant  amount.  Unfortunately,  restriction  (Rl)  does  not  admit  such  estimates  of  A  for 
A  0  .  Thus,  instead  of  using  the  MM  estimate  of  A  for  A  0  .  we  will  instead  use  a  quantized  MM 
estimate  of  A  .  wherein  the  MM  estimate  of  A  is  quantized  in  the  direction  of  small  A  . 

Let  m2,m4  .  and  m6  denote  the  second-order,  fourth-order,  and  sixth-order  sample  moments, 
respectively,  of  a  sequence  of  10000  independent,  unnormalized  Class  A  envelope  samples  and  let 
Amm  denote  the  MM  estimate  of  A  based  on  these  moments.  Then,  (see  3.3a), 

3 

m4 

- 1 

2  (m2)2 

A  MM  ~  2  •  (4.11) 

m6  3m  4 

- +  2 

6(m2)3  2(m2)Z 

Furthermore,  let  6^  (  A  .  K  )  denote  the  asymptotic  variance  of  the  normalized  MM  estimate  of  A 
based  on  the  fourth  and  sixth  moments,  the  expression  for  which  was  derived  in  Section  3.2.3. 
Consider  the  following  simple  quantization  scheme: 

(i)  Divide  (— oo  .  oo)  into  the  subintervals  (— oo.l.lXlO  2),  [  Aj ,  A™**  )  (1  <  j  <  990),  and 
■  «*).  where 

Aj  4  j  10-3  +  10-2  (4.12a) 

and 

-2  ^/2 

€A  (A,  .  10  2  ) 

A?"  4  A.  +  - - -  X  A.  .  (4.12b) 

10000 
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(ii)  Then,  take 


10~2 


A 


o 


[  min  A)  ]  —  10 

X  <J  <990 

*■*  ![Aj  .a  “«)(^WAf)  =  1 
1 


if  Amm  6  (-00 . 1.1  x  10  2) 
if  Amm  6  [1.1  X  1<T2,A™X)  . 

^  Amm  ^  990  •  °°) 


(4.13) 


where  IB  denotes  the  indicator  ftmction  of  the  set  B  . 

Note  that  A™**  is  an  upper  bound  which,  in  the  worst  case,  is  correct  approximately  85%  of 
the  time  (based  on  a  Gaussian  approximation  for  the  distribution  of  AMM  ;  see  Section  3.2.2).  Thus, 
the  above  quantization  scheme  achieves  the  goal  of  quantizing  toward  smaller  values  of  A  without 
undue  distortion.  This  statement  is  further  supported  by  simulation  results  wherein  the  proposed 
initiation  procedure  was  in  fact  seen  to  yield  estimates  A0  satisfying  restriction  (Rl).  (Note  that 
the  quantization  scheme  must  be  independent  of  the  parameter  K  since  K  is  unknown.  Now,  from 
Table  3.1,  it  is  evident  that  for  fixed  A  and  arbitrary  K.  £a{A.K')  is  largest  for  K  —  10-2. 
Thus,  the  choice  of  K  —  10-2  in  (4.12b)  yields  the  most  conservative  bound.) 


Initiation  Procedure  for  K 

We  need  to  locate  an  initiation  procedure  for  K0  which  satisfies  restriction  (R2).  Again,  we 
decompose  the  search  for  such  a  procedure  into  two  steps:  First,  we  will  locate  an  estimator  which 
yields  an  estimate  of  K  which  differs  from  K  by  less  than  an  order  of  magnitude.  Then,  we  will 
multiply  this  estimate  by  a  factor  of  10  and  use  the  resulting  value  for  K0.  In  so  doing,  we  will 
obtain  values  for  K0  which  satisfy  (R2). 

Now.  we  need  an  estimator  which  will  approximate  K  to  within  an  order  of  magnitude.  It 
was  seen  in  Section  3.4  that  estimators  which  correctly  distinguish  between  impulsive  and  back¬ 
ground  samples  provide  good  estimates  of  the  Class  A  parameters.  Moreover,  it  was  seen  in  our 
earlier  discussion  that  correct  discrimination  of  the  samples  involves  the  determination  of  the 
optimum  threshold  corresponding  to  the  true  parameter  vector.  Thus,  by  considering  a  recursive 
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version  of  the  Threshold-Comparison  estimator  (given  in  Section  3.4)  which  utilizes  a  simple, 
heuristic  scheme  for  approximating  the  threshold,  perhaps  we  can  obtain  an  estimator  which  yields 
estimates  of  K  having  the  desired  property.  With  this  in  mind,  consider  the  following  initiation 
procedure  for  obtaining  K 0  : 


Let  Z_qq9,  Z_998  ,  .  .  .  ,  Z0,  Zj  ,  .  .  .  ,  Z 20oo  denote  a  sequence  of  3000  independent  Class  A 
envelope  samples. 

Step  1:  Choose  (  y(0)  ,  cra2(0)  ,  -ycr  2(0)  )  arbitrarily  and  let 

X0  A  max  {Z_qqq  ,  .  .  .  ,  Zq}  , 

Zq  4  min  {Z  . .  Zj}  . 

Step  2:  For  0  <  k  <  1999.  update  the  threshold  rk  using  the  following  system  of  equations: 


%k  +1  ~  Xk  + 


rk  =  (XkYt?n 

X(  +1  —  Tk  ) 

k 

[£  X(Zt+1-  r,)]  +  1 


C  zk  +1  xk  ] 


[  1  X(  Zk  +1  rk  )  ] 

Yk  +1  ~  Yk  +  -  t  Zk  +1  —  Yk  ]  . 

[  £(l-X(Zm-  r,))]  +  1 


where 


X(x)  = 


1  if  x  >  0 
0  if  x  <  0 


X(  Zk  +1  —  Tk  )  =  1  =>  Zk  +1  is  classified  as  an  impulsive  sample, 
X(  Zt  +1  —  Tk  )  =  0  =>  Zk  +1  is  classified  as  a  background  sample. 
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Step  3:  For  1000  ^  ^  1999.  update  recursively  the  estimates  of  y,  <ra2,  and  >< r62 : 

>  (  *  —  999)  =  y(k—  1000)  +  - - - [X(Zt+1-  rt)  -  >(*-1000)]  .  (4.14a) 

*  -  999 


<r2(*-999) 


<r a(*  -  1000)  + 

(1— \(Zt+1—  Tt)) 

— -  t2- 

£  (1  -X(Z1+1-  )  ) 

1=1000 


k 

if  £  (l-X(Z,+1~ri))^0 


<r2(*-1000)] 


(4.14b) 


ora2(  *  — 1000) 


k 

if  £  (l-X(Z,+1-r,))  =  0. 
1=1000 


and 


ycr^  (*  —  999)  =  ycr^k  -  1000)  + 

~ [X(Zt+1-  rk  )  Zt2+1  -  ^2(*-1000)]  . 
*-999 

Step  4:  Compute  as  follows  : 


First,  let 


K*  =  [—  In  (l  ->(1000))] 


a2(1000) 

ycr62(lOOO)  -  >(1000)  cf-a2( 1000) 


Then,  let 


(4.14c) 


(4.15) 


K**  =10  K*  . 

Finally,  take 


10"2 

if 

K**  >  10"2 

II 

o 

K** 

if 

K**  €  [10-6.  10"2]  . 

(4.16) 

10“* 

if 

K**  <  10-6 
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This  initiation  procedure  attempts  to  locate  the  optimum  threshold  corresponding  to  the  true 
parameter  vector  via  the  sequence  of  threshold  estimates  rk .  Note  the  simplicity  of  the  scheme 
used  to  approximate  this  optimum  threshold:  Xk  is  the  average  of  X0  and  the  values  of  those  sam¬ 
ples  classified  as  impulsive  by  the  k  -th  iteration:  Yk  is  the  average  of  Y 0  and  the  values  of  those 
samples  classified  as  background  by  the  k- th  iteration:  and  Tk  is  simply  the  geometric  mean  of 
these  two  quantities.  Note,  moreover,  that  (4.14a)-(4.14c)  are  based  on  the  update  equations  for 
the  BDD  algorithm  given  in  (4.7a)-(4.7c)  and  that  the  relation  for  K*  given  in  (4.15)  is  obtained 
using  relation  (4.8b).  (The  interpretation  of  the  update  parameters  y.  cra2.  and  ycr2  is  the  same  as 
that  for  the  parameters  irvcrg,  and  7r1<r/2.  respectively.)  Now,  simulations  reveal  that  for  values 
of  the  true  parameter  vectors  for  which  K  ^  10~\  K*  approximates  K  to  within  an  order  of 
magnitude.  However,  even  though  K*  is  less  than  K  by  less  than  an  order  of  magnitude  for 
parameter  vectors  for  which  K  >  10-”4,  K*  sometimes  exceeds  K  by  more  than  an  order  of  magni¬ 
tude  for  these  parameter  vectors.  But,  by  multiplying  K*  by  a  factor  of  10,  constraining  the 
resulting  value  to  lie  within  the  set  of  allowable  values  for  the  parameter  K ,  -nd  using  this  con¬ 
strained  value  for  K  0.  it  is  seen  that  estimates  K0  which  satisfy  restriction  (R2)  are  obtained. 

2 

Initiation  Procedure  for  cr 
—  2 

For  cr0 ,  we  will  use  the  sample  second  moment  of  those  samples  used  to  obtain  AMM .  Let  m2 
be  defined  as  above.  Then,  we  shall  take 

&o  =  m2  ■  (4.17) 

Simulations  have  shown  that  the  sample  second  moment  based  on  10000  samples  yields  accurate 
estimates  of  the  envelope  second  moment.  Thus,  estimates  O'q  obtained  using  (4.17)  do.  in  fact, 
satisfy  restriction  (R3). 

The  descriptions  of  the  initiation  procedures  for  the  parameters  are  now  complete.  We  will 

now  determine  the  changes  in  the  BDD  algorithm  induced  by  (Ml)  and  (M2).  First,  consider  (Ml). 

—  2 

Since  cr0  is  the  sample  second  moment  of  a  sequence  of  10000  Class  A  envelope  samples,  the 
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estimator  of  cr2  given  by  (4.8c)  will  be  replaced  by  the  following  update  equation  for  this  sample 


second  moment: 


=  Z2n_x  +  - (  Z2  -  &2_x  ).  n>l. 

n  +  10000 


(4.18) 


Now,  consider  (M2).  Let  denote  the  last  iteration  for  which  the  estimate  of  A  is  fixed  to  its 
initial  value  A  0  .  Then,  for  n  <  .  the  estimator  of  A  given  by  (4.8a)  must  be  replaced  by  the 

following  estimator: 


A,  = 


(4.19) 


Now,  from  (4.5a)  and  (4.5b),  note  that 


°7  “  *B 


Thus,  for  n  ^  nf  .  the  following  estimator  of  K  can  be  used: 


K  = 


A  o  OTg(n) 

l-  e~Ao  CT/(n)  -  07(71) 


(4.20) 


where  the  update  equation  for  07  is  given  by  (4.7b)  and  the  update  equation  for  cr2  is  given  analo¬ 
gously  as  follows  : 


6r/(n  —  1 )  + 


(Z„2-  <77(71 -1))  if£0z?*O 


a-2 in )  = 


(4.21) 


67  (n  -  1  ) 


if  Z  4>i  ~  0 


<x y  (0)  €  £ .  (If  n"  is  defined  to  be  the  first  iteration  for  which  a  given  sample  is  classified  as  impul- 
sive.  then  for  n  ^  71"’.  O’;  (n  )  is  simply  the  sample  second  moment  of  those  samples  classified  as 
impulsive  by  the  n  -th  iteration.)  Note  from  the  estimators  of  the  Class  A  parameters  given  in 
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(4.18)-(4.20)  that  only  the  estimates  of  the  two  update  parameters  crj  and  <j/  are  required  when 
n  ^  nf  .  Consequently,  in  lieu  of  the  update  parameters  v1,<r^,  and  iTjcr/  used  previously,  the 
update  parameters  crj  and  trf  will  be  used  when  n  ^  nf  . 

We  are  now  in  a  position  to  present  the  following  modified  BDD  algorithm  which  incorporates 

■ 

the  above  modifications  and  some  additional  minor  modifications. 

43.1.  Modified  BDD  (MBDD)  algorithm 

Step  1  :  Choose  the  initiation  vectors.  Choose  ( (0) ,  arj  (0) )  arbitrarily  and  obtain 

(  A  0  ,  K0  ,  <r0  )  using  the  initiation  procedures  described  above. 

Step  2' :  Classify  sample  as  impulsive  or  background.  Since  m  ^  1  in  (4.6)  and 

1CT6  ^  K  ^  10~2,  it  follows  that  m  +  K  =  m .  Using  this  approximation  for  (  m  +  K )  in  (4.6), 

classify  Zn  ,  n  ^  1.  as  an  impulsive  sample  or  as  a  background  sample  using  the  following 
simplified  test: 

Let 


1  if  zn  >g(i„_x,in_1.^2_1) 
0  if  z„  <  g(i„_1.in_1.o-„2_1) 


where 


1/2 

1 


-  2 

&n-l 

In 

a-  r 

K  T  n~l 

-1 

&n- 1  L 

.  m !  m 

m  ?  1 

and  where 


<f>n  =  1  =>  Z„  is  classified  as  an  impulsive  sample, 
4>n  =  0  =>  Z„  is  classified  as  a  background  sample. 
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Step  3  :  Update  recursively  the  appropriate  update  parameters. 

(a)  For  n  <  nf  ,  update  recursively  the  estimates  of  erf  and  erf  using  (4.7b)  and  (4.21),  respec¬ 
tively. 

(b)  For  n  >  rif  .  update  recursively  the  estimates  of  nv  erf.  and  rr^f  using  (4.7a),  (4.7b),  and 

(4.7c),  respectively.  Note  that  these  update  equations  require  estimates  of  ttv  erf.  and  7 r^erf 

2  <■•2 

when  n  =  nf  .  Since  crfl  is  an  update  parameter  when  n  ^  nf  .  <xB(  nf  )  is  available.  Fur¬ 
thermore,  using  relations  (4.4a)  and  (4.4c),  estimates  of  7Tj  and  at  the  nf  -th  iteration 

can  be  obtained  as  follows: 

TT\(nf)  —  1  —  e  ~*nf  =  1—  e~^°  , 

An  +  ttx( nf  )Kn  A0+  (1  -e~A°)Kn 

~  2,  ,  ~  2  t  1  t  -  2  / 

TrlaI(nf  )  =  <r  - - - : -  =  <r  - - - r - 

'  Anf  +  Knf  r  A0+  Knf 

Step  A,' :  Obtain  estimates  of  the  Class  A  parameters. 

Let  an  and  n*  be  defined  as  in  Step  4  of  the  BDD  algorithm. 

(a)  For  n  <  n* ,  obtain  crn  using  (4.18)  and  set  (  An  ,  Kn  )  =  (  An  ,  Kn  _x  ). 

(b)  If  n*  ^  nf  ,  then  for  n*  ^  n  <  .  obtain  estimates  for  A  K ,  and  cr2  using  (4.19),  (4.20), 

and  (4.18),  respectively. 

(c)  For  n  >  mai(?i/  . n*—  1  ),  obtain  estimates  for  A,  K,  and  cr2  using  (4.8a),  (4.8b),  and 
(4.18),  respectively. 

Step  5  :  Constrain  estimates  of  Class  A  parameters  to  lie  in  parameter  set  of  interest. 

Define  A*  .  (if  .  fi  2  ,  /3 f  .  and  (if  as  in  Step  5  of  the  BDD  algorithm. 

M  M  2 

(a)  For  n  ^  nf  .  constrain  An,  Kn.  and  <x„  using  (4.9a),  (4.9b),  and  (4.9c).  respectively.  (  Let 

An.  Kn  .  and  or2  denote  the  constrained  estimates.  )  If  (An  ,  Kn  .  <f„2)  =  (An  ,  Kn  ,  erf),  then 

execution  of  this  step  is  complete.  However,  if  ( An  ,Kn,<rn)?±  ( An  ,Kn.ern  ).  then  proceed  as 

w  2  2 

follows  :  Using  the  relations  given  in  (4.4a)-(4.4c),  modify  a- 3(  n  )  and  a-;  (  n  )  to  reflect  the 
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2  2  2 
change  in  the  estimates  of  A  .  K .  and  cr  induced  by  the  constraint.  (  Let n  )  and  cr7  (  n  ) 

denote  the  modified  estimates.  )  : 


cr^Cn)  =  cr; 


_ 2 


K. 


A.  +  K 


_ 2 

=  O', 


^o+  Kn 


crjin)  = 


_ 2 

cr. 


A„  +  Cl-e~A"  )Kn 


(l-e  A «  )(A„  +  Jfa) 


_ 2 

=  cr. 


A0  +  (l-e~*°)if„ 


(l-e  ^#)(i0  +  i:n) 


(b) 


Finally,  set  ( <xj(  n  )  .  n  )  )  =  (  cr^  ( /i  )  ,  c r/(  n  )  )  and  (  A„  ,  ifrt  .  or^2  )  =  ( An  ,  .  crn2  )  . 

For  n  >  n.f  .  proceed  as  in  Step  5  of  the  BDD  algorithm. 


The  description  of  the  MBDD  algorithm  is  now  complete.  Experimentation  has  shown  that 
setting  nf  to  1000  eliminates  the  difficulties  associated  with  not  fixing  the  estimate  of  A  in  the  ini¬ 
tial  stages  of  the  algorithm,  without  unduly  slowing  the  convergence  of  the  algorithm.  Moreover, 
extensive  simulation  of  the  MBDD  algorithm  (with  Uf  =  1000)  has  shown  that  convergence  of  the 
algorithm  is  essentially  attained  within  5000  iterations  (i.e.,  the  relative  variation  in  the  estimates 
of  the  parameters  from  iteration  to  iteration  for  iteration  values  near  5000  is  very  slight)  and  that 
good  estimates  of  all  parameter  vectors  in  the  parameter  set  of  interest  can  generally  be  obtained 
for  this  iteration  value.  Occasionally,  however,  for  values  of  A  ~  0(^  10-1)  (X  arbitrary),  the 
estimate  of  A  can  be  somewhat  low.  Furthermore,  for  these  values  of  A  and  values  of 
K  —  0(^  lO-’’).  the  estimate  of  K  is  occasionally  high.  This  problem  can  be  easily  remedied  by 
noting  the  following:  Even  though  the  estimate  of  A  is  low  and  that  of  if  is  high,  these  estimates 
are  closer  to  the  true  values  than  those  given  by  A0  and  K0.  Thus,  by  restarting  the  BDD  algo¬ 
rithm  with  an  initiation  vector  consisting  of  these  estimates  of  A  and  K ,  better  estimates  of  the 
parameters  can  be  expected.  Now.  it  was  noted  that  the  convergence  of  the  MBDD  algorithm  can  be 
attained  within  5000  iterations.  However,  even  after  the  3000-th  it.  ition,  the  variation  in  the 
estimates  of  the  parameters  is  only  slight.  Thus,  the  BDD  algorithm  can  equally  well  be  restarted 
with  the  estimates  of  A  ,  K .  and  c r  obtained  at  the  3000-th  iteration.  With  this  in  mind,  consider 
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the  following  modification  to  the  MBDD  algorithm: 

If  the  estimate  of  A  at  the  3000-th  iteration  exceeds  10-1,  then  introduce  as  a  second  estimate 
of  the  update  parameter  7 rx  the  proportion  of  samples  classified  as  impulsive  after  the  3000-th 
iteration.  Denote  this  second  estimate  by  7 r  v  At  some  iteration  n  miX  >  3000  (and  for  all  itera¬ 
tions  thereafter),  obtain  the  estimate  of  A  using  relation  (4.5a)  and  7r  2  as  an  estimate  of  rr1, 
instead  of  7r1.  If.  in  addition,  the  estimate  of  K  at  the  3000-th  iteration  exceeds  10-4,  then  also 
introduce  as  a  second  estimate  of  <rB  the  proportion  of  samples  classified  as  background  after  the 
3000-th  iteration  and  as  a  second  estimate  of  the  proportion  of  samples  classified  as  impulsive 
after  the  3000-th  iteration  times  the  sample  second  moment  of  those  samples.  Denote  these  esti¬ 
mates  by  cr  g  and  respectively.  Then,  for  all  iterations  n  ^  nmax  .  obtain  the  estimate  of  K 

using  relation  (4.5b)  and  7r  1(  cr  J,  and  7 as  estimates  of  the  corresponding  update  parameters. 
Furthermore,  for  iteration  values  greater  than  3000  and  less  than  nmiX  ,  obtain  the  estimates  of  A 
and  K  using  the  original  estimates  of  the  update  parameters.  Also,  if  the  estimate  of  A  at  the 
3000-th  iteration  exceeds  10  but  that  of  K  does  not  exceed  10  ,  then  for  n  ^  nmax  ,  continue  to 
obtain  the  estimate  of  K  using  the  original  estimates  of  the  update  parameters. 

The  first  iteration  value  n  raax  ,  for  which  the  second  estimates  of  the  update  parameters  are 
used  to  obtain  the  estimates  of  the  model  parameters,  is  the  smallest  iteration  value  which  satisfies 
the  following  two  conditions: 

(i)  There  must  exist  iterations  n  x  and  n2,  3000  <  n  x ,n2  ^  nmax  •  such  that  Z„  is  classified  as  an 
impulse  and  Zn^  is  classified  as  background  since,  otherwise,  invalid  estimates  of  the  Class  A 
parameters  would  be  obtained. 

(ii)  nmax  must  be  greater  than  3000  plus  an  offset  8,  specified  below  as  a  function  of  the  estimate 
of  A  at  the  3000-th  iteration.  For  iteration  values  greater  than  3000  and  less  than  or  equal  to 
3000  +  8,  the  original  estimates  of  the  update  parameters  are  used  to  obtain  the  estimates  of 
the  model  parameters.  Consequently,  this  experimentally  determined  offset  8  is  chosen  so 
that,  for  iteration  values  slightly  greater  than  3000  +  8, 
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(a)  the  accuracy  of  the  estimate  of  A  obtained  using  7r  j  is.  on  the  average,  higher  than 
that  obtained  using  7rlf  and, 

(b)  when  the  second  estimates  of  the  update  parameters  are  used  to  obtain  the  estimate  of 
K .  the  accuracy  of  this  estimate  is.  on  the  average,  close  to  or  higher  than  that 
obtained  using  the  original  estimates  of  the  update  parameters. 

Otherwise,  it  would  be  more  appropriate  to  continue  using  the  original  estimates  of  the  update 
parameters  in  obtaining  the  estimates  of  the  model  parameters. 

By  implementing  the  above  modification  to  the  MBDD  algorithm,  what  is  effectively  being 
done  is  the  following:  At  the  3000-th  iteration  of  the  MBDD  algorithm,  the  BDD  algorithm  is 
restarted  alongside  the  MBDD  algorithm,  using  as  estimates  of  the  model  parameters  in  the  initia¬ 
tion  vector  for  the  restarted  algorithm  the  estimates  of  these  parameters  obtained  from  the  MBDD 
algorithm  at  the  3000-th  iteration.  Then,  at  some  well-defined  iteration  after  the  3000-th  iteration 
(and  for  all  iterations  thereafter),  the  estimates  of  A  and  K  obtained  from  the  BDD  algorithm  are 
sometimes  used  instead  of  those  obtained  from  the  MBDD  algorithm.  The  estimates  of  the  parame¬ 
ters  obtained  from  the  restarted  algorithm  are  used  whenever  it  is  expected  that  these  estimates 
will,  on  the  average,  and  within  a  moderate  sample  size,  provide  better  estimates  of  the  parameters 
than  those  offered  by  the  MBDD  algorithm. 

The  proposed  modification  to  the  MBDD  algorithm  should  yield  an  algorithm  which  provides 
a  global  estimator  of  the  Class  A  parameters  for  all  parameter  vectors  in  the  parameter  set  of 
interest.  The  steps  of  this  modified  MBDD  algorithm  will  now  be  given. 

43.2.  Global  decision-directed  algorithm 

Step  :  Choose  the  initiation  vectors.  Proceed  as  in  Step  1*  of  the  MBDD  algorithm.  In  addi- 
tion.  choose  ir  1(0) ,  <r  B  (0),  and  (0)  arbitrarily. 

Step  2'  :  Classify  sample  as  impulsive  or  background.  Proceed  as  in  Step  2'  of  the  MBDD  algo¬ 


rithm. 
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Step  3  :  Update  recursively  the  appropriate  update  parameters 

Let 


1  if  3000  >0.1 


0  if  A  3000  ^0.1 


1  if  K  3000  10 


0  if  K <  10‘ 


500 

if 

0.1  < 

^  3000  <  0.2 

400 

if 

0.2  < 

3000  ^  0.3 

300 

if 

0.3  < 

^  3000  ^  0.4 

200 

if 

V/ 

■'T 

o 

3000  ^  0.5 

100 

if 

0.5 

■A  3000  ^ 

Furthermore,  let  n**  denote  the  minimum  value  of  n  .  n  >  3000,  for  which  €n  ^  0,  where 


€„  A  (  z  0|)(  I  (1-^)). 

I  =3001  l  =3001 


n  >  3000 


(a)  For  n  ^  1000,  proceed  as  in  Step  3  (a)  of  the  MBDD  algorithm. 

(b)  For  1000  <  n  ^  3000,  proceed  as  in  Step  3  (b)  of  the  MBDD  algorithm. 

(c)  For  n  >  3000, 

Case  (  i  )  :  fi  —  0. 

Continue  to  recursively  update  ir1,crB.  and  using  (4.7a),  (4.7b),  and  (4.7c), 


respectively. 


to  Ki 
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Case  (  ii  )  :  /jl  =  1  and  r\  =  0  . 

(1)  For  3000  <  n  ^  max  (  3000  +  S  ,  n**  —  1  ),  continue  to  recursively  update  rry  ,  cr^ .  and 

tt  1cr i  using  (4.7a).  (4.7b).  and  (4.7c),  respectively.  In  addition,  update  recursively  the 
estimate  tt  1  as  follows  : 

wXn  -3000)  =  n  .(  n  -3001  )  +  - - - (A  -  v  .(  n  -3001  ))  .  (4.22) 

n  -  3000 

(2)  For  n  >  max  (  3000  +  5  ,  n **  —  1  ),  continue  to  recursively  update  the  estimates  tt1, 
o’J,  tt1<s2  ,  and  m  j  using  (4.7a),  (4.7b),  (4.7c).  and  (4.22),  respectively. 


Case  (  iii  )  :  n  =  1  and  t)  =  1  . 

(l)  For  3000  <  n  ^  max  (  3000  +  S  ,  n**  —  1  ),  continue  to  recursively  update  ttv  crj ,  and 

tt-Pi  using  (4.7a),  (4.7b),  and  (4.7c),  respectively.  In  addition,  update  tt  :  using 

.  ^  2  ^  2 

(4.22)  and  update  recursively  the  estimates  cr  B  and  t txcti  as  follows  : 

(1-0 


(  n  -  3000  )  = 


cr  tin  -3001  )  + 


(Z„  -cr|(n  -3001))  if  £(1-0;)^O 

I  (1 -<*>,) 

(4.23) 


l  =3001 


laa(n-300l)  ^  L  ( '- )  =  0 

1= 3001 

and 

iryrfC  n  -  3000  )  =  vyr,2  (  n  -  3001  )  +  - - - (  <f>n  Z2  -  -n n  -  3001  )  )  .  (4.24) 

n  —  3000 

(2)  For  n  >  maxi.  3000  +  8  ,  n**  —  1  ),  continue  to  recursively  update  the  estimates  tt  t, 
a  3  ,  and  ttx<j?  using  (4.22).  (4.23),  and  (4.24).  respectively. 
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Step  4  Obtain  estimates  o  f  the  Class  A  parameters. 

Let  an  and  n*  be  defined  as  in  Step  4  of  the  BDH  algorithm. 

(a)  For  n  <  n*  .  proceed  as  in  Step  4  (a)  of  the  MBDD  algorithm. 

(b)  If  n*  ^  1000,  then  for  n*  ^  n  ^  1000,  proceed  as  in  Step  4  (b)  of  the  MBDD  algorithm. 

(c)  If  n*  ^  3000,  then  for  max  (  1000  ,  n*  —  1  )  <  n  <  3000.  proceed  as  in  Step  4  (c)  of  the 
MBDD  algorithm. 

(d)  Case  (  i  )  :  /i  =  0  . 

For  n  >  max  (  3000  ,  n*  —  1  ),  proceed  as  in  Step  4  (c)  of  the  MBDD  algorithm. 


Case  (  it ) :  /x  =  1  and  t)  =  0  . 

(1)  If  ft*  ^  max  (3000  +  S  .  n **  —  l)  .  then  for  max  (3000  .  n*  —  l)  <  n  ^  max  (3000  +  8. 
n **  —  1)  ,  proceed  as  in  Step  4*(c)  of  the  MBDD  algorithm. 

(2)  For  n  >  max  (  3000  +  8  ,  n**  —  l),  obtain  estimates  of  K  and  cr2  using  (4.8b)  and 

(4.18) .  respectively.  Then,  using  the  relation  given  in  (4.5a).  obtain  the  estimate  of  A 
as  follows  : 

An  =  -  In  (  1  -  n  x  (  n  -  3000)  )  .  (4.25) 

Case  (  Hi  )  :  fj.  =  1  and  tj  =  1  . 

(1)  If  n*  ^  max  (3000  +  8  ,  n**  —  1)  ,  then  for  max  (3000  .  n*  —  l)  <  n  ^  max  (3000  +  8, 

n **  —  1)  ,  proceed  as  in  Step  4  (c)  of  the  MBDD  algorithm. 

(2)  For  n  >  max  (  3000  +  8  .  n **  —  1  ).  obtain  estimates  of  A  and  cr2  using  (4.25)  and 

(4.18) ,  respectively.  Then,  using  the  relation  for  K  given  in  (4.5b),  obtain  the  estimate 


of  K  as  follows  : 
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o*J(  n  —  3000  ) 

Kn  =  —  lnCl  —  TT^n  -3000))  -  .  (4.26) 

v YTjCn  -  3000)  -t/  n  -  3000  )  <x  n  -  3000  ) 

Step  5'' :  Constrain  estimates  of  Class  A  parameters  to  lie  in  parameter  set  of  interest  . 
D.^ne  A*  .  /3^  ,  ,  0^,  and  /3  *  as  in  Step  5  of  the  BDD  algorithm. 

(a)  For  n  ^  1000.  proceed  as  in  Step  5  (a)  of  the  MBDD  algorithm. 

(b)  For  1000  <  n  ^  3000.  proceed  as  in  Step  5  of  the  BDD  algorithm. 

(c)  For  n  >  3000, 

Case  ( i  )  :  /j.  =  0  . 

Proceed  as  in  Step  5  of  the  BDD  algorithm. 

Case  (ii)  :  p.  —  1  and  7]  =  0  . 

(1)  For  3000  <  n  <  max  (  3000  +  S  ,  n**  —  1  ).  proceed  as  in  Step  5  of  the  BDD  algorithm. 

(2)  For  n  >  max  (  3000  +  S  ,  n**  —  1  ),  constrain  An.  Kn,  and  an  using  (4.9a),  (4.9b),  and 

(4.9c).  respectively.  (Let  An ,  Kn .  and  <r„  denote  the  constrained  estimates.)  If 
(_An  ,  Kn  ,  (Jn)  =  G4„  ,  Kn  ,  <xn).  then  execution  of  this  step  is  complete.  However,  if 

,  cr„  )  G4„  ,Kn.un  ).  then  proceed  as  follows  :  Modify  tr^n  ),  aB  (n  ),  and 

rr^fin  )  using  (4.10a).  (4.10b),  and  (4.10c),  respectively.  (Let  If x(n  ),  cfj(n),  and 
7T jCT- /(n  )  denote  the  modified  estimates.)  In  addition,  using  the  relation  given  in  (4.4a), 
modify  tt  t(  n  —  3000  )  as  follows  (let  7 ft(  n  —  3000  )  denote  the  modified  estimate): 

-3000)  =  1  -  e~A"  . 

Finally.  set  (  irff  n  )  .  <xj(  n. )  ,  ;2(  n  )  )  =  (  7f,(  n. )  .  cfj(  n  )  ,  TTjCr^  n. )  ) 

ir  x(  n  -3000  )  =  WL(  n  -  3000  )  .  and  (  .  ct*  )  =  (An.Kn.W, f  )  . 
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Case  ( Hi  ):  fj.  =  1  and  rj  =  1  . 

(1)  For  3000  <  n  <  max  (  3000  +  8  ,  n**  —  1  ).  proceed  as  in  Step  5  of  the  BDD  algorithm. 

(2)  For  n  >  max  (  3000  +  8  ,  n**  —  1  ).  constrain  An  ,  Kn  ,  and  <r2  using  (4.9a).  (4.9b).  and 

(4.9c).  respectively,  (let  An ,  iLn .  and  ct2  denote  the  constrained  estimates)  If 
( An  .  Kn  ,  <f2)  =  (A„  ,  Kn  ,  a- 2).  then  execution  of  this  step  is  complete.  However,  if 

C4„  .  .  <fn  )  (An  ,  Kn.  O’ n  ),  then  proceed  as  follows  :  Using  the  relations  given  in 

(4.4a)-(4.4c),  modify  tt  —3000),  arJ(n  —3000),  and  Tr^r2{n  —3000)  (let 
if1(n  —3000),  cfgi  n  —3000),  and  ir1cr/2(  n  —  3000)  denote  the  modified  esti¬ 
mates)  : 

ft(n  -  3000)  =  1  -e~A"  , 

An  +  Kn 

7i  n 

and 


o=j(n  -3000)  =  o=n2 


n  -  3000  )  = 


An  +  i FjU  -  3000  ) 


A„  + 


Finally,  set  (  v  x(n  —  3000)  ,  <r  J(n  —  3000)  ,  —  3000)  )  *  (  n x(n  —  3000) 

aj(/i  -  3000)  ,  Wp]{n  -  3000)  )  ,  and  ( Aa  ,  ir„  ,  o'2)  -  (A,  ,  Kn  ,  a„2). 


The  steps  of  the  global  decision-directed  algorithm  are  now  complete.  In  the  following  sec¬ 
tion.  the  performance  of  the  proposed  algorithm  will  be  examined  via  simulation. 

4 A.  Simulation  Results 

In  order  to  assess  the  performance  of  the  proposed  global  decision-directed  algorithm,  an 
extensive  simulation  study  of  the  algorithm  was  made.  First,  the  sample  mean-square  relative 
error  incurred  by  the  initial  estimate  of  a2  was  computed  for  all  (A  ,  K ,  c r2)  in  ft  (defined  in 
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Appendix  A)  using  100  data  sets.  The  results  are  tabulated  in  Table  4.1.  It  is  evident  from  the 
values  of  the  relative  errors  cited  in  this  table  that  the  sample  second  moment  based  on  10000  sam- 
pies  provides  highly  accurate  estimates  of  cr  .  Furthermore,  since  the  estimate  of  cr  at  each  itera¬ 
tion  of  the  algorithm  is  obtained  by  simply  updating  this  sample  second  moment,  the  proposed  glo- 
bal  scheme  provides,  on  the  average,  highly  accurate  estimates  of  cr  at  each  iteration.  Conse¬ 
quently.  in  assessing  the  ability  of  this  scheme  to  estimate  the  remaining  two  parameters  A  and  K , 
2 

the  parameter  cr  was  assumed  to  be  known. 

Using  100  data  sets,  each  containing  a  sequence  of  5000  observations  randomly  generated 
from  the  Class  A  envelope  pdf,  the  1%- trimmed  mean  relative  bias  of  the  estimate  of  A  and  the 
1%- trimmed  mean  relative  bias  of  the  estimate  of  K  were  computed  for  each  (A  ,  K .  ar2)  6  Cl.  Let 
bA  and  bK  denote  these  quantities,  respectively,  and  let  Sj4  and  8*  denote  the  relative  errors  in  the 
estimates  of  A  and  K ,  respectively,  obtained  using  the  J-th  data  set,  i.e.. 


Sf  4 


A1  —  A 


and  8f  A 


K1  -K 


where  ( A 1  .  K J  )  is  the  estimate  of  (  A  ,  K  )  obtained  using  the  j  -th  data  set.  Then, 


Table  4.1.  MEAN-SQUARE  RELATIVE  ERROR  OF  <x02  (cr2  =  1.0) 


K 

A 

10  2 

10~3 

10"“ 

10"5 

10"6 

10~2 

4.7266X10-3 

1.5638X10"2 

1.8560X10"2 

1.8897X10"2 

1.8931X10”2 

o 

) 

H 

■ 

1.5018X10-3 

1.7473X10'3 

1.7755X10'3 

1.7784X10"3 

1.7786X10'3 

1 

3.2742X10-4 

3.3107X10"“ 

3.3144X10”“ 

3.3148X10"“ 

3.3148X10"“ 
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—  (  mas  8A)  —  (  min  8  A) 

J  J 

and 

—  (  max  8f)  —  (  min  Sj*) 

)  J 

The  values  for  these  biases  are  tabulated  in  Tables  4.2  and  4.3.  Note  from  these  tables  that  the 
values  for  bA  and  bK  are  quite  low  for  all  parameter  vectors  under  consideration.  Thus,  the  pro¬ 
posed  global  decision-directed  algorithm  yields  an  essentially  unbiased  recursive  estimator  of  the 
parameters  A  and  K  of  the  Class  A  model. 

In  addition  to  the  relative  biases,  the  following  quantities  were  computed  for  each 
(A  .  K .  cr  )  6  n  using  the  aforementioned  data  sets: 

(i)  the  1%-trimmed  sample  mean-square  relative  error  due  to  estimating  A  , 

(ii)  the  1%-trimmed  sample  mean-square  relative  error  due  to  estimating  K . 
and 

(iii)  the  1%-trimmed  sample  mean-square-norm  relative  error  (MSNRE)  due  to  estimating 
A  and  K . 

Let  eA  .  eK ,  and  etot  denote  the  quantities  described  in  (i),  (ii),  and  (iii).  respectively.  We  note 
that,  for  each  of  these  'luantities,  the  "trimming"  is  based  on  the  exclusion  of  a  data  set  which 
yields  the  highest  value  for  etol  and  a  data  set  which  yields  the  lowest  value  for  etot ,  i.e.,  if 

j *  4  erg  max  l(SA)2+  (Sj1)2 

J  1 

j**  4  arg  min  |(SA)2+  (Sj*) 2  . 

J  1 


*x  4  — 
98 


z  *i 


)  Sl 


bA  A  — 
98 


Z  s; 


and 


I 
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Table  4.2.  1%  -  TRIMMED  RELATIVE  BIAS  OF  ESTIMATE  OF  A  {bA) 

(tr2  =  1.0) 


K 

A 

10~2 

10'3 

10-4 

10_i 

f 

o 

H 

C4 

1 

O 

r-4 

3.8241X10-2 

6.0110X10-2 

6.3540X10-2 

6.4057X10*2 

6.3781X10  2 

10_1 

-9.7836X10"2 

-5.0248X10"2 

-4.2653X10-2 

-4.1699X10-2 

-4.1542X10"2 

1 

— 4.5694X10-2 

— 8.8028X10-3 

-1.6972X10-3 

— 8.5670X10-4 

-7.3438X10 i"4 

Table  4.3.  1%  -  TRIMMED  RELATIVE  BIAS  OF  ESTIMATE  OF  K  {bK  ) 

(o'2  =  1.0) 


K 

A 

10~2 

7 

o 

H 

T 

o 

10“3 

10-6 

10-2 

— 6.4449X10-2 

4.6075X10-2 

5.2917X10"2 

J.3716X10'2 

1.1140X10-1 

10'1 

-3.5629X10-2 

3.1080X10-2 

3.0820X10-2 

3.4567X10-2 

4.8787X10-2 

-3.1074X10 


1.4006X10' 


-9.2976X10' 


-1.6823X10' 


2.0792X10 
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then 


and 


^  = 


98 


100 

z  csj4)2 

-  (S*)2  -  (Sf.)2 

•4 

II 

-> 

£  r 


98 


r\2 


100 
I  (8* ) 

7=1 


CS  jZ  )2  -  (8*)2 


*«<*  ~eA  +  eK‘ 

(For  the  computations  performed  here,  j*  and  /**  were  uniquely  determined.  However,  the 
defining  properties  for  these  two  quantities  does  not  guarantee  this.  If  either  j*  or  /'**  is  not 
uniquely  determined,  then  these  indices  are  chosen  arbitrarily  among  those  that  satisfy  the 
corresponding  defining  property.)  The  computed  values  for  eA.  eK ,  and  etot  are  tabulated  in 
Tables  4.4.  4.5.  and  4.6.  respectively.  As  with  the  relative  biases,  it  is  seen  from  Table  4.6  that  the 
MSNRE  is  quite  low  for  all  parameter  vectors  under  consideration.  Moreover,  from  a  comparison 
of  the  values  given  in  Tables  4.4  and  4.5.  it  is  evident  that  neither  eA  nor  eK  dominates  in  its  con¬ 
tribution  to  etot .  Thus,  in  a  mean-square  error  sense,  not  only  does  the  proposed  estimator  provide 
a  very  good  global  estimator  of  both  parameters  A  and  K ,  but,  in  addition,  it  has  no  difficulty  in 
estimating  one  parameter  over  the  other. 


Lastly,  the  normalized  sample  MSNRE  (  A  5000  X  etot  )  was  computed  (see  Table  4.7)  and 

compared  to  the  Cramer-Rao  Lower  Bound  (CRLB).  (The  values  for  the  CRLB  were  given  in  Table 

3.8.)  Note  that  the  values  for  the  normalized  sample  MSNRE  and  CRLB  are  essentially  on  the  same 

_ 2 

order  of  magnitude.  In  fact,  for  A  =  10  ,  the  values  for  the  normalized  sample  MSNRE  are  quite 
close  to  those  of  the  CRLB.  Furthermore,  a  comparison  of  the  computed  values  for  the  normalized 
sample  MSNRE  of  the  proposed  recursive  estimator  and  the  normalized  sample  MSNRE  of  the 
batch  scheme  upon  which  it  is  based  (see  Table  3.9)  indicates  that  the  proposed  recursive  scheme 
performs  better  than  the  batch  estimator  for  a  sizeable  subset  of  the  parameter  set  under  considera¬ 


tion. 
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Table  4.4.  1%  -  TRIMMED  MEAN-SQUARE  RELATIVE  ERROR  DUE  TO 

ESTIMATING  A  (eA) 


K 

A 

10-2 

n 

1 

O 

H 

o 

i 

10'5 

10-6 

10"2 

1.0896X10-2 

1.4162X10'2 

1.4684X10-2 

1.4761X10-2 

1.4702X10-2 

10"1 

1.3786X10-2 

6.4035X10-3 

6.0285X10"3 

5.8759X10-3 

5.7837X10-3 

1 

3.1014X10  3 

1.1524X10"3 

1.0628X10-3 

1.0558X10"3 

1.0589X10'3 

Table  4.5.  1%  -  TRIMMED  MEaN-SQUARE  RELATIVE  ERROR  DUE  TO 

ESTIMATING  K  (ex  ) 


K 

A 

10-2 

io-3 

10-4 

10_i 

lO^6 

10-2 

1.2353X10-2 

2.1480X10'2 

2.3808X10"2 

2.3974X10-2 

2.8163X10-2 

o 

J 

4.1013X10"3 

8.2767X10'3 

7.4102X10-3 

6.3852X10-3 

7.5959X10-3 

1 

3.8256X10"3 

3.3176X10-3 

2.1203X10-3 

1.7420X10"3 

2.3851X10-3 
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Table  4.6.  1%  -  TRIMMED  SAMPLE  MSNRE  ( etot  ) 


K 

A 

10-2 

10-3 

7 

o 

H 

10~s 

10-5 

10~2 

2.3249X10"2 

3.5643X10"2 

3.8492X10'2 

3.8735X10-2 

4.2865X10-2 

10~l 

1.7888X10-2 

1.4680X10'2 

1.3439X10-2 

1.2261X10'2 

1.3380X10"2 

1 

6.9270X10-3 

4.4699X10-3 

3.1831X10'3 

2.7978X10-3 

3.4439X10-3 

Table  4.7.  NORMALIZED  SAMPLE  MSNRE  (  A  5000  X  etot  ) 

FOR  (A ,  K .  <r2)  6  O 


K 

A 

10  2 

7 

o 

H 

T 

o 

yH 

1 

O 

yH 

T 

o 

H 

10~2 

1.1624X102 

1.7821X102 

1.9246X102 

1.9367X102 

2.1433X102 

10'1 

8.9438X101 

7.3401X101 

6.7194X101 

6.1306X101 

6.6898X101 

1 

3.4635X101 

2.2350X101 

1.5916X101 

1.3989X101 

1.7220X101 
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In  summary  then,  we  see  that  the  proposed  global  decision-directed  algorithm  does  in  fact 
yield  a  global  estimator  of  the  parameters  which  performs  very  well  for  all  parameter  vectors  in 
the  parameter  set  of  interest,  even  for  moderate  sample  sizes.  In  view  of  this  performance,  this 
algorithm  provides  an  effective  recursive  estimator  of  the  Class  A  model  parameters. 

4.5.  Conclusions 

In  this  chapter,  we  have  developed  a  global  recursive  estimator  of  the  parameters  of  the 
strictly  canonical  Class  A  model  which  performs  very  well  for  all  parameter  vectors  in  the  param¬ 
eter  set  of  interest.  The  starting  point  in  the  study  was  the  development  of  the  so-called  BDD  algo¬ 
rithm.  This  basic,  decision-directed,  adaptive  scheme  is  physic  motivated,  easy  to  implement, 
and  is  a  recursive  version  of  a  batch  procedure  which  was  seen  in  our  earlier  work  to  provide  good 
estimates  of  the  parameters.  Unfortunately,  examination  of  the  performance  of  this  algorithm  via 
simulation  reveals  two  inherent  drawbacks  of  the  scheme  that  adversely  affect  its  performance 
even  in  a  local  setting.  However,  it  is  seen  that  by  placing  certain  restrictions  on  the  form  of  the 
initiation  vector  for  the  algorithm  and  by  incorporating  the  appropriate  modifications  into  its 
framework,  the  ensuing  difficulties  associated  with  the  two  basic  shortcomings  can  be  eliminated, 
and  an  improvement  in  the  performance  of  the  algorithm  can  be  attained  globally.  Examination  of 
the  performance  characteristics  of  the  modified  algorithm  via  simulation  indicates  that  this  algo¬ 
rithm  does,  in  fact,  yield  an  effective  global  recursive  estimator  of  the  Class  A  model  parameters. 
Although  this  final  algorithm  is  somewhat  complex,  the  payoff  for  this  complexity  is  excellent  glo¬ 
bal  performance. 
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5.  EFFICIENT  ESTIMATION  FOR  SMALL  SAMPLE  SIZES  :  THE  EM  ALGORITHM 

5.1.  Introduction 

In  the  previous  chapters,  we  have  obtained  several  batch  and  recursive  estimators  of  the  Class 
A  parameters  that  yield  good  estimates  of  the  parameters  for  moderate  sample  sizes.  In  this 
chapter,  we  will  focus  our  attention  on  the  problem  of  obtaining  a  batch  estimator  of  these  parame¬ 
ters  with  good  j-mu/i-sample-size  performance  for  all  parameter  vectors  in  the  parameter  set  of 
interest. 

One  estimator  that  has  the  potential  of  providing  estimates  of  the  Class  A  parameters  with 
the  above-mentioned  property  is  the  EM  algorithm.  This  algorithm,  which  is  ideally  suited  for 
estimation  problems  where  the  observations  can  be  viewed  as  "incomplete  data,"  was  popularized 
by  Dempster,  Laird,  and  Rubin  in  1977  [17].  We  begin  this  chapter  with  a  description  of  this  algo¬ 
rithm.  We  then  examine  the  behavior  of  the  EM  algorithm  within  the  Class  A  framework.  In  par¬ 
ticular.  for  the  single-parameter  estimation  problem,  a  closed-form  expression  for  the  estimator  is 
obtained  first.  Several  properties  of  the  estimator  are  also  derived.  Using  these  properties,  it  is 
shown  that  the  sequence  of  estimates  obtained  via  the  EM  algorithm  converges  and  a  characteriza¬ 
tion  of  the  nature  of  the  point  to  which  the  sequence  converges  is  given.  An  implementation  of  the 
estimator  based  on  the  execution  of  two  EM  algorithms  in  parallel  is  then  described.  Using  this 
implementation,  the  small-sample-size  performance  of  the  proposed  estimator  is  assessed  via  an 
extensive  simulation  study.  The  results  of  this  study  indicate  that  the  proposed  EM  estimator 
yields  excellent  estimates  of  the  parameter  for  small  sample  sizes. 

The  two-parameter  estimation  problem  is  then  examined.  For  the  two-parameter  estimation 
problem,  a  description  of  the  procedure  through  which  estimates  of  the  parameter  are  obtained  is 
given  first.  Furthermore,  using  an  implementation  analogous  to  the  one  used  for  the  single¬ 
parameter  estimation  problem,  the  small-sample-size  performance  of  the  proposed  EM  estimator  is 
also  assessed  via  an  extensive  simulation  study.  Again,  as  for  the  single-parameter  estimation 
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problem,  this  study  reveals  that  the  EM  algorithm  yields  an  excellent  estimator  of  the  Class  A 
parameters  for  small  sample  sizes. 

5.2.  The  EM  Algorithm 

A  commonly-used  two-step  iterative  technique  for  estimating  the  parameters  of  a  density 
when  the  observations  can  be  viewed  as  "incomplete-data"  is  the  EM  algorithm  [17].  Mixture  den¬ 
sities.  such  as  the  Class  A  density,  can  be  placed  naturally  in  this  framework.  In  particular,  for 
such  densities,  the  "incomplete-data"  set  is  the  set  of  observations,  whereas  each  element  of  the 
"complete-data"  set  can  be  defined  to  be  a  two-component  vector  consisting  of  an  observation  and 
an  indicator  specifying  which  component  of  the  mixture  occurred  during  that  observation.  Instead 
of  using  the  traditional  incomplete-data  density  in  the  estimation  process,  the  EM  algorithm  uses 
the  properties  of  the  complete-data  density.  In  so  doing,  it  can  often  make  the  estimation  problem 
more  tractable  and  also  yield  good  estimates  of  the  parameters  for  small  sample  sizes. 

Let  0.  denote  the  parameter  vector  to  be  estimated.  §[? )  the  estimate  of  9.  obtained  at  the  p  -th 
iteration  of  the  algorithm.  z_  the  incomplete-data  set  (set  of  observations),.*  the  complete-data  set, 
and  g  the  likelihood  function  associated  with  x_ .  The  two  steps,  the  expectation  step  (E-step)  and 
maximization  step  (M-step).  of  the  EM  algorithm  are  then  given  as  follows  : 

E-step:  Evaluate  2  (j^iL(?))  [  logg  (x.  |  JL)  | 

M-step:  Determine  0.=  6^ +1)  to  maximize  Q(9. |.(L(p^). 

The  EM  algorithm  can  be  viewed  as  an  alternative  to  maximizing  the  function  g  over  9.  In  partic¬ 
ular.  since  g  is  unknown,  we  instead  maximize  its  expectation  given  the  available  pertinent  infor¬ 
mation.  namely  the  observed  data  and  the  current  estimate  of  the  parameters. 

Consider  the  following  basic  property  of  the  EM  algorithm  :  Let  L  denote  the  likelihood 
function  associated  with  z_,  i.e.,  the  incomplete-data  likelihood  function.  If  the  function  g  is  posi¬ 
tive  almost  everywhere  on  its  domain,  then  it  can  be  shown  via  a  simple  application  of  Jensen’s 
Inequality  that 
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M0°,+1))  >  ulp))  .  p  —  o .  (5.1) 

i.e.,  the  likelihoods  of  interest  are  monotone  increasing  in  \  Given  this  desirable  property  of  the 
EM  algorithm  and  its  estimation  potential  for  small  sample  si2es.  let  us  now  consider  the  problem 
of  estimating  the  parameters  of  the  Class  A  model  via  the  EM  algorithm. 


5.3.  Estimation  of  Class  A  Parameters 

Let  z_  4  (  z 1  .  .  •  •  .  zn)  denote  the  incomplete-data  set  consisting  of  n  i.i.d.  observations  gen¬ 
erated  from  the  Class  A  envelope  pdf  w.  with  unknown  parameter  vector  0.4  (A,  K  Y  to  be 
estimated.  With  each  observation  zi ,  we  can  associate  an  unobserved  infinite-dimensional  indicator 
vector  Vj  =  (vi}  ,j  =1.2...  ,)r ,  whose  entries  are  all  zero  except  for  one  equal  to  unity  in  the  posi¬ 
tion  corresponding  to  the  unobserved  component  of  the  Class  A  envelope  mixture  density  associ¬ 
ated  with  Z; .  Thus,  let  x_  4  {(Zj  ;  i  =1 . n  }  denote  the  complete-data  set  and  g  denote  the 

likelihood  function  associated  with*..  Under  the  assumption  that  v1 . vn  are  i.i.d.  and  that  the 

z,  given  vt  are  conditionally  independent  and  identically  distributed,  we  have  that 

n  oo 

g  (ii  i) = n  n  (a  )v"  hj  cz, :  i  )v,; . 

(=1  7=1 


where 


and 


VJ 


(A)= 


e~A  A  J~l 
0  -1)! 


hj  (z,- .  0.)  =  2  z(. 


A  +  K 
j-l+K 


2  A  +X 
Z>  )  -1+X 


The  corresponding  logarithm  is  given  by 

n  n 

L0 (0.)  =  £  <vi.v(0)>  +  £  <v,.t/,(£.)>  (5.2) 

i=i  i=i 

where  V(0.)  and  Ui  (  0.)  are  infinite-dimensional  vectors  with  the  j  — th  component  In  tt  ■  and 
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In  hj  (z{ ;  0_),  respectively,  and  is  the  standard  l2-inner  product.  Let  (£ p  ^4  04  ,  *  )T 

(Ap  >  0  ,  Kp  >  O'),  p  =  0, 1 - -  denote  the  estimate  of  0.  obtained  at  the  p  — th  iteration  of  the  EM 

algorithm.  Then,  using  (5.2)  and  the  definition  of  the  Q  function  given  in  the  E-step  of  the  EM 
algorithm,  we  have  that 

n  n 

G(I|1(^)=  Z<ai.V(i)>  +  £<a,.,  £/,(!)>  .  (5.3) 

i  =1  i  =1 


where 

4^(vi|z.0(^))  =  ^(vi|zi.eO,)) 


(5.4) 


and  the  latter  equality  fellows  from  the  independence  assumptions  stated  above.  Let  a 
0  —  1.2, . . .)  denote  the  j  —  th  component  of  at .  Then  it  follows  from  (5.4)  and  the  definition  of 
Vj  that 

7rJCAp)hJ(zi:e(/,)) 

C,J  °°  ’  (55) 

L? Tj(Ap)hj(zi:()tp'’) 

j=i 


Now.  the  second  step  of  the  EM  algorithm,  the  M-step,  requires  that  we  determine  the  argument 
0_—  (A  ,K)r  that  maximizes  Q  (  JL|  JL  P To  make  the  dependence  on  A  and  *  more  explicit,  we 
can  use  the  definitions  of  V ,  i/; .  Vj  ,  and  hj  to  rewrite  the  expression  given  in  (5.3)  in  terms  of  A 
and  *  as  follows: 


n  oo 

(2(i|l(/,))=  £  £  ai}  [-A  +0  -l)ln A  -In (y  -1)!] 


i=lj=  1 

n  oo 

+  LL*ij 

•  =U  =1 


(A  +  *)  2  (A  +  *) 

In  2  z,  +  In - Z; - 

0-i  +  *)  (/-!  +  *) 


(5.6) 


Using  the  fact  that  £a^  =  1  and  eliminating  all  terms  in  (5.6)  that  are  constants  with  respect  to 
)=  i 

the  maximization,  we  obtain  the  following  objective  function  Q^j  (JL|  JL(/I '): 
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<2o*;  (JLl  JL(,°  =  A  +  a  In  A  +  n  In  (A  +  K) 

/l  oo  n  co  q 

-Z  I  au  lnO-i+^r)  -  (a  +jsr)  Z  Z  *i2 - - — 

i  =1  j  =1  i  =i  y  =i  Cy— H--K") 


(5.7) 


where  a  4  Z  Z  0  ~  1)  aij  ■  The  M-step  then  yields  as  the  estimate  of  the  true  parameter  vector  at 
i=i J =i 

the  p  +l-st  iteration  of  the  algorithm  a  maximizing  argument  of  our  objective  function,  i.e.. 

I0, +1)  =  argmax(2oA/Ul(f,)).  (5.8) 

With  regard  to  this  maximization,  let  us  address  the  single-parameter  estimation  problem  first. 


5.4.  Single-Parameter  Estimation  Problem 

Consider  the  single-parameter  estimation  problem  wherein  the  parameter  A  is  unknown  and 
the  parameter  K  is  known.  Let  Qx  denote  the  objective  function  to  be  maximized  in  the  M-step  of 
the  EM  algorithm  for  this  single-parameter  estimation  problem.  Then,  upon  fixing  Kp  to  its  known 
value  K  in  (5.7).  and  thereafter  eliminating  all  terms  in  (5.7)  that  are  constants  with  respect  to  the 
one-dimensional  maximization,  we  obtain  the  following  expression  for  QK  : 

Qx(A\Ap)  -  —  n  A  +/3  1n  A  +  n  In  (A  +K)-<f>A,  (5.9) 

n  oo  n  oo 

where  0  4  Z  Z  0  ~  1)  b,j  ■  4>  4  Z  Z  z?  /(/“■ 1+-S- )).  and  biJ&aiJ\K^=x  .  Now.  recall  from 

i  “1  y  — 1  i  =1  y  =1 

the  definition  of  the  parameter  set  A  of  interest,  that  A  takes  on  values  in  the  interval 

2  4 

A  [10  *  lj.  Define  A.a  to  be  an  extension  of  this  interval: 

A*  4  [A  |  (10~2/(l+e))  A  <  1+e}.  e  >  0,  (5-10) 


where  €  is  chosen  arbitrarily1,  and  let  Aa  be  the  compact  set  over  which  the  maximization  is  per¬ 
formed  in  the  M-step  of  the  EM  algorithm,  i.e.. 


lWe  consider  an  €-extension  of  Aa  instead  of  A^  itself  since,  for  the  boundary  points  10  2  and  1  of  A  t  .  it  is  only  rea¬ 
sonable  that  we  admit  estimates  of  these  parameter  values  that  lie  in  a  neighborhood  of  the  true  values  (see  (5.14)). 
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AP  +1  =  arS  ma*  Qg  (-a  I AP  )• 


At  A* 


A  0  €  . 


(5.11) 


This  maximization  problem  can  be  readily  solved  by  noting  the  following  property: 

Basic  Property  :For  each  p  €  0, 1 . Qx  (A  |  Ap  )  is  concave  in  A  on  (0.  oo)  (for  all  K  >  0). 

Proof:  From  (5.9),  we  have  that 

3  <2*U|a,)= - -  — 


&4' 


A2  (A+JiT)2 


Since  Ap  €  Aa  for  all  p .  it  follows  that  Ap  >  0  for  all  p .  which  implies  from  the  definition  of  0 
that  /3  >  0.  Thus, 

.2 


Qk(.A\A)<0  for  all  p  . 


□ 


It  follows  from  this  basic  property  that,  if  QK  attains  its  maximum  at  an  interior  point  of  (0,oo), 
then  a  necessary  and  sufficient  condition  for  A  =  A  max  to  be  the  maximizing  root  on  this  interval  is 
that 

-^Qk(a\ap')\a  =Amu  =  °  ■  (5.12) 

Evaluation  of  this  gradient  equation  yields  the  following  quadratic  equation  : 

atA2  +  a2A  +«3\A=Aatx=0  . 

where  a  t  —  n  —  <f>  .a  2:=  a  +  0  +  n  ,  and  a3  :=  0  K .  Note  that  av  a2,  and  a3  are  functions 
only  of  z_,  Ap  .and  K .  Now,  it  follows  from  the  definition  of  <f>  that  4>  is  nonnegative,  which 
implies  that  a  x  <  0.  Furthermore,  since  0  >  0  (see  proof  of  Basic  Property)  and  K  >  0  ,  it  also 
follows  that  a  j  >  0.  Given  that  a  t  and  a3  are  negative  and  positive,  respectively,  and  the  require¬ 
ment  that  A  be  positive,  one  of  the  roots  to  the  quadratic  equation  can  be  disregarded 
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immediately,  and  we  obtain  the  following  closed-form  expression  for  A  miX  in  terms  of  the  observa¬ 
tions,  the  known  parameter  K ,  and  Ap  : 


A  JiaX 


I  i.  A,  ,  if)  +  [  a22  (z.,  A,  ,  K)  -  4  a  ,(i .  Ap  .  if)  a3(  z_.  Ap  ,  K  )  ] 

-2a1(z..A.,.i:) 


1/2 


(5.13) 


Since  A  mtx  maximizes  Qs  on  (0,  oo),  it  follows  that  if  A  max  lies  in  Aa  ,  then  A  max  maximizes  QK  on 

*  * 

Aa.  However,  if  A  mix  lies  outside  of  Aa,  then  it  follows  from  the  Basic  Property  of  QK  that  the 

*  * 

argument  which  maximizes  QK  on  Aa  is  simply  the  boundary  point  of  nearest  A  max.  In  sum¬ 
mary  then,  the  solution  to  (5.11)  is  given  as  follows: 


10 


-2 


10 


-2 


if 

A  max  < 

1+6 

l+€ 

* 

Ap+i  - 

A  max 

if 

A  max  ^  Aa 

1+  € 

if 

A  max  >  14"  6 

(5.14) 


* 

A  0  6  .  The  estimator  described  by  recursion  (5.14)  will  be  referred  to  in  the  sequel  as  the  EM 

estimator  of  A. 


5.4.1.  Properties  of  EM  estimator  of  A 

Let  (0.  oo)  be  the  domain  of  QK  ( K  >  0). 
exist  for  all  A  €  (  0,  oo). 


Note  that  - Qx  (A  I  A.  )  and 

QA  P 


77  Or  IV 

dA 


Property  1:  For  each  p  €  {0, 1 . },  there  exists  X^,  >  0  such  that 

Q*(Ap+l\Ap)-QxCA  j  Ap)Z\p(Ap+1-Ap)2  . 

Proof.  The  property  clearly  holds  if  Ap  =Ap+1.  Suppose  Ap  Ap+l .  Expand  ^(AjA^,)  in  a 
Taylor  series  about  Ap  +l  to  obtain 
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<2x(A\Ap)  =  Qk  (Ap+1  |  Ap  )  +  (A  -Ap+l)  —  Qk  (A,  +1 1  A,  ) 

,2  .  ^ 

+  w  —  SjtW'rtM,) 

5A 


(5.15) 


*  * 

for  some  Ap+1  satisfying  min(  Ap  .  AB  +1)  <  Aa  +1  <  max  (  Ap  .  Ap  +1).  Evaluation  of  (5.15)  at 


A  —  Ap  yields  : 


Qk  (Ap  |  Ap  )  -  Qk  ( Ap  +i  |  Ap  )  +  (  Ap  ~Ap  +1)  Qx  ( Ap  +1  [  Ap  ) 

.2  ,  SA 

QA 


(5.16) 


Now.  Ap  +1  can  take  on  one  of  three  values  :  (10  2/(l  +  e)),  A  mtx  ,  or  1  +  €  (see  (5.14)). 

6  9 

(i)  Suppose  Ap  +i  =  A  m„  .  Then  - Qs  (Ap  +l  |  Ap  )  = - QK  (A  max  |  A  )  =  0.  where  the  latter 

0A  QA 

inequality  follows  from  the  definition  of  Amax.  Consequently. 

(Ap  —  Ap  +1)  —■  Qk  (Ap  +i  |  Ap  )  =  0. 
oA 


_2  t 

(ii)  Suppose  Ap+1  =  (10  /(1+e)).  Then,  Ap+1  <  Ap  (since,  of  course,  A0  6  Aa  also).  Also  note 


V  c  •‘-24 


that  Ap+1  =  (10  /(1  +  e))  only  when  A  m„  <  (10  2/(l  +  e)),  i.e.,  Ap+1^Amax.  Since 


Ap  +i  ^  -A  m»x  and  - (A  ( A  )  =  0,  it  follows  from  the  Basic  Property  of  QK  that 

QA 

9  .9 

- Qk  (Ap  +1 1  Ap  )  <  0.  Thus,  we  have  that  Ap  —  Ap  +1  >  0  and  -  Qjj-  (A  +1 1  Ap  )  ^  0, 

6A 

which  imply  that  (Ap  -  Ap  +1) - £*  Up  +1  |  Ap  )  <  0. 

QA 

(iii)  Suppose  Ap  +1  =  1  +  e.  Then  Ap  +1  ^  Ap  .  Also  note  that  Ap  +1  =  1  +  e  only  when  A  mix  ^  1  +  e, 

Q 

i-e.,  Ap  +1  ^  A  mtx.  Since  Ap  +1  <  A  mtx  and  - 2r  (A  max  |  Ap  )  =  0,  it  again  follows  from  the 

QA 


Basic  Property  of  Qx  that  —  Qx  (Ap  +1  ]  Ap  )  >  0.  Thus,  we  have  that  Ap  -  Ap  +1  <  0  and 


d  3 

—  Qk  (Ap  +i  I  Ap  )  >  0.  which  imply  that  (Ap  -  Ap  +1) - QK  (Ap  +1  j  Ap  )  <  0. 

0A  QA 
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In  all  three  cases,  (A,  —  Ap  +1) - Qs  (A,  +1 1 A  )  ^  0,  which  together  with  (5.15)  implies  that 

34 

.2 

Qs  u,  +1 1-4,)- (A,  |  A,  )  >  -  (,Ap  +1  -  A,  f  (2*  CA*+l  |A,  )• 

34 


u  /  I  \ 

Moreover,  from  the  concavity  of  QK  .  we  have  that - —  QK  (A,  +1  j  A  )  <  0.  Thus,  by  taking 


34' 


~  2  Qs  ^4, +i  |  4,  ). 


34' 


the  statement  of  the  property  follows. 


□ 


The  following  property  follows  straightforwardly  from  Property  1: 

Property  2:  For  each  K  >  0.  there  exists  a  scalar  X  >  0  such  that 

Qs  (4,  +1 1 4,  )  -  <2,  (A,  1 4,  )  >  X  (A,  +1  -  4,  f  for  all  p  6  {0. 1. . . .}. 

Proof  We  have  shown  in  Property  1  that  for  each  p  G  {0. 1. . . }  .  there  exists  X,  >  0  such  that 

Qs  (4,  +i  1 4,  )  -  Qk  (A,  1 4,  )  £  X,  (4,  +1  -Apf  .  (5. 17) 

where 

a2  % 

\  —  ~~r  Qs  (4,  +l  1 4,  )  (5.18) 

34 

when  4,  ^  4,  +1  and  min(  Ap  .  A,  +1)  <  A,  +1  <  max  (  A,  ,  A,  +1).  Equations  (5.9)  and  (5.18)  yield 
the  following  expression  for  X,  : 

@p  n 

\  =  — - +  — 1 -  • 

(A,+1)2  (A,  +1  +K)2 

where  the  dependence  of  3  o n  P  is  explicitly  shown  here.  Since  0p  >  0  for  each  p  ,  it  follows  that 
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x„  > - 

p  * 


n 


(ap+  ,  +  *r 


*  * 

Furthermore,  since  min  (  .  Ap  +1)  <  Ap  41  <  max  (  Ap  .  Ap  +1)  and  Ap  ,  Ap  +I  €  .  we  have  that 

AP+i  €  A*  .  i.e.,  Ap+l  <l+e.  Thus. 


> 


n 


(1  +  e  +  K) 


for  all  p  . 


which  implies  that 


X&  inf  X^,  ^ 


(l  +  e  +  if)z 


>  0  . 


(5.19) 


It  then  readily  follows  from  (5.17)  and  (5.19)  that 


CrUp+il^O-GtfOUA,)  ^  X(il.+i-.A,)2  for  all  p 


Thus,  we  have  the  desired  result. 


□ 


We  are  now  in  a  position  to  prove  the  following  convergence  theorem.  Let  LK  denote  the 
incomplete-data  log-likelihood  function  for  known  K  (if  >0),  i.e..  the  traditional  log-likelihood 
function  for  the  single-parameter  estimation  problem  : 


Lk(A)=  £  In  £  TTjCA^hjCz^.A.m 

i=l  J =1 


=  ^  £  le^A'-1  A+K 

itl  J=1  0-1)1  j  — 1+if 


.  ^  €A^ 


2  A+K 
z‘  1  -t+X 

zi  e 


A  6  A* 


(5.20) 


Note  that  LK  is  continuous  on  Aa  and  differentiable  in  the  interior  of  Aa  .  Since  LK  is  a  continu¬ 
ous  function  on  the  compact  set  A^ .  it  follows  that  LK  is  bounded.  Furthermore,  from  (5.1).  we 

have  that  LK  (Ap  +1)  ^  Ls  (Ap  )  for  all  p  €{0.  1.  . . .}.  Thus,  since  [LK{Ap))  is  a  monotone 

* 

bounded  sequence.  \LK(.Ap))  converges  monotonically  to  some  L  .  Now.  let 


L  (|3)  4  (A  €  Aa  :  Lz  (A  )  =  0}.  The  statement  of  the  theorem  is  then  given  as  follows: 

*  *  * 

Convergence  Theorem  :  Let  LK  and  L  be  defined  as  above.  Then  Ap  A  €  Aa  and 

*  *  *  *  _ 

LK  (A  )-*  L  -  Lk(A  ).  Furthermore,  if  A  is  an  interior  point  of  Aa  .  then -  —  0  . 

8A  a=a 

i.e..  A  is  a  stationary  point  of  Ls  ,  the  likelihood  function  of  interest. 

*  *  *  * 

Proof:  It  is  evident  that  if  Ap  -*  A  ,  then  L  =  LK  (A  ).  Now.  in  order  to  show  that  Ap~*  A  ,  it 

suffices  to  show  that 
0)  lAp+i-A,  |  —  0  asp  -*  oo  ,  and 
(ii)  L{L  )  is  discrete. 

That  these  two  conditions  are  sufficient  for  the  convergence  of  {Ap  }  follows  from  Theorem  6  of 
Wu  [18]. 

Consider  condition  (i).  We  have  shown  in  Property  2  that  there  exists  a  scalar  X  >  0  such 

that 

<2x  CAP  +1 1  A,  )  -  e*  (Ap  I  Ap  )  £  X  (Ap  +x  -  Ap  f  for  all  p  6  {0. 1. . . .}.  (5.21) 

From  the  definitions  of  LK  and  Qx  ,  it  can  also  be  readily  shown  via  a  simple  application  of 
Jensen's  Inequality  that 

Lk  (Ap  +1)  -  Ls  {Ap  )  >  Qk  iAp  +i  |  Ap)-QK  (A,  |  Ap  )  for  all  p  6  {0. 1. . . .}  (5.22) 

(see  proof  of  Theorem  1  of  Dempster,  Laird,  and  Rubin  [17]).  Thus,  it  follows  from  (5.21)  and 
(5.22)  that 

LKUp^)-LKUp)^UAp^-Apf  for  all  p€  {0.1....}.  (5.23) 

Since  \LK  (Ap  )}  converges,  we  have  that  j  LK  (Ap  +1)  —  LK  (Ap  )  ]  =  LK  {Ap  +1)  —  LK  (Ap  )  -*  0  as 
p  oo  .  where  the  first  equality  follows  from  the  monotinicity  property  of  [Lk  (Ap  )}.  It  then  fol¬ 
lows  from  (5.23)  that  |  Ap  +1  —  Ap  (  -♦  (  as  p  -*<x>  and  the  verification  of  condition  (i)  is  complete 
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*  * 

Consider  condition  (ii).  We  need  to  show  that  {A  €  :  LK  (A  )  —  L  =  0}  is  discrete.  It  can 

*  * 

be  readily  shown  that  LK{A)  —  L  is  analytic  in  .  Since  it  is  well  known  that  the  zeroes  of  an 
analytic  function  (which  is  not  identically  to  zero)  are  isolated,  the  result  follows. 

The  verification  of  conditions  (i)  and  (ii)  is  now  complete.  Since  Ap  —  A  in  the  closure  of 

*  *  *  *  * 

Aa  and  A^  is  a  closed  set,  it  follows  that  A  €  A^.  We  must  now  show  that  if  A  is  an  interior 

*  * 

point  of  Aa  ,  then  A  is  a  stationary  point  of  Lx .  Again,  from  Theorem  6  of  Wu  [18],  it  suffices  to 
show  that 


3 

(iii)  - Qk  ( Ap  +i  |  Ap  )  =  0  for  all  p  sufficiently  large,  and 

9  *  * 

(iv)  - Qk  {A  |  Ap  )  is  continuous  in  (A  ,  Ap  )  on  A^  XAa  . 

QA 


Consider  condition  (iii).  If  A  is  an  interior  point  of  Aa  ,  then  there  exists  some  p'  such  that 
Ap+l  is  an  interior  point  of  A^  for  all  p  ^  p".  Thus,  it  follows  from  (5.14)  that  A/>+1  =  Amax  for 

3 

all  p  ^  p‘ .  (Of  course,  Amax  varies  with  p  .)  Since  - QK  (A  m4X  |  A_  )=  0  by  definition  of  Amax, 

3A 


we  have  that - QK  ( Ap  +j  |  A  )=  0  for  all  p  ^  p' .  The  verification  of  the  condition  is  complete. 


Consider  condition  (iv).  Note  that  the  infinite  series  in  the  expression  for  QK  (A  ]  Ap  )  given  in 

*  * 

(5.9)  are  uniformly  convergent  series  on  Aa  XAa.  Furthermore,  each  term  of  a  given  series  is  a 

*  * 

continuous  function  on  A^  XA^.  It  follows  then  that  each  series  is  a  continuous  function  on  this 

9  . 

domain.  Thus,  the  expression  for  - Qx  (A  |  Ap  )  consists  of  sums  and  ratios  of  continuous  func- 


•  * 

tions  on  Aa  X  Aa  .  The  result  follows. 

*  * 

Since  conditions  (iii)  and  (iv)  hold,  it  follows  that  if  A  is  an  interior  point  of  A^ .  then  A 
is  a  stationary  point  of  LK  .  The  proof  of  the  theorem  is  now  complete.  □ 


82 


The  particular  stationary  point  to  which  [Ap  }  converges  is  dependent  upon  the  initiating  point 
A  0.  Thus,  given  a  "sufficiently  rich"  set  of  initiating  points,  then  upon  executing  several  EM  esti¬ 
mators  of  A  in  parallel,  each  initiated  with  a  different  point  from  this  set,  all  of  the  stationary 
points  corresponding  to  relative  maxima  of  the  (traditional)  likelihood  function  can  be  located,  and 
hence,  the  point(s)  corresponding  to  the  absolute  maximum.  With  this  in  mind,  consider  the  fol¬ 
lowing  implementation  of  the  EM  estimator  of  A  . 

5.4.2.  Implementation  of  EM  estimator  of  A 

* 

First,  choose  e  —  0.1  in  the  definition  of  (given  in  (5.10)).  Secondly,  let  tl  set  of  initiat¬ 
ing  points,  A*k.  consist  of  the  left  boundary  point  of  Aa  and  the  (logarithmic)  mean  of  the  inter¬ 
val  defined  by  A^  ,  i.e..  A*u  =  {9.09  X  10-3 , 10-1}.  (That  the  two  points  in  A*u  form  a  "sufficiently 
rich"  set  for  the  determination  of  the  point(s)  corresponding  to  the  absolute  maximum  of  the  (trad¬ 
itional)  likelihood  function  was  verified  experimentally.)  For  each  true  parameter  A  ,  two  EM  esti¬ 
mators  of  A  were  executed  in  parallel,  one  initiated  with  9.09  X  10  3,  the  other  with  10  \  Let 
{A^l  and  iAp2).  p  —  0,  1 . denote  the  sequences  of  estimates  obtained  via  these  two  EM  estima¬ 

tors  (A  q1  =  9.09  X  10-3  and  A  2  =  10-1).  Both  EM  estimators  "converged"  in  the  sense  that,  for  each 
estimator,  the  magnitude  of  the  relative  difference  in  the  values  of  the  estimates  on  successive 
iterations  decreased  as  the  number  of  iterations  increased.  For  each  of  the  two  sequences  of  esti¬ 
mates  {Ap},  i— 1,2,  the  so-called  convergence  iteration  value  was  then  taken  to  be  the  minimum 
iteration  value  l  Q.  Z  1)  for  which  [((A/  —  A{_x  )/A,')|  <  1 0~7.  Let  p*  and  p**  denote  the  conver¬ 
gence  iteration  values  for  {Apl\  and  {Ap},  respectively.  Then,  of  the  two  "limits"  A*,  and  A2„  . 

a 

the  estimate  A  of  the  true  parameter  A  was  chosen  to  be  that  limit  which  maximized  the  tradi¬ 
tional  incomplete-data  likelihood  function,  i.e., 

A  4  arg  max  LK  (A  ). 

ACM1.  A  *  I 

Let  us  now  examine  the  performance  of  this  estimator. 
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5.4.3.  Simulation  results 

An  extensive  simulation  study  of  the  proposed  estimator  wa s  performed  for  a  wide  range  of 
parameter  vectors  (A  JC )  in  A,  in  particular,  for  all  (A ,  K )  6  fl  .  where 
f 1  A  {(A  JC  )r  €A  |  log  A  €  Z  and  log  K  6  Z  }.  For  each  (A  JS.  )  pair,  if  was  fixed  to  its  true  value 
and  estimates  of  the  paramter  A  were  obtained  using  the  implementation  of  the  EM  estimator  of 
A  described  above.  Using  100  data  sets,  each  containing  100  samples  generated  from  the  Class  A 
envelope  pdf.  the  sample  mean-square  relative  error  was  first  computed.  Let  pA  denote  this  quan¬ 
tity.  Then 

.  2 
Aj  —A 

A 

where  A}  denotes  the  estimate  of  A  obtained  using  the  j  — th  data  set.  The  values  for  pA  are  tabu¬ 
lated  in  Table  5.1.  Note  from  this  table  that  the  values  for  the  sample  mean-square  relative  error 
are  extremely  low  for  all  (A  J5T )  pairs  under  consideration  (on  the  order  of  10  3  to  10  2).  The 
values  of  the  normalized  sample  mean-square  error  (NSMSE),  which  is  defined  to  be  the  product  of 
the  sample  size  (100)  and  the  sample  mean-square  relative  error,  were  then  computed  (see  Table 
5.2)  and  compared  to  the  Cramer-Rao  Lower  Bound  (CRLB).  (The  values  for  the  CRLB  are  given 
in  Table  5.3.)  Note  that  the  values  for  the  NSMSE  are  very  close  to  the  CRLB  for  all  (A  JC  )  pairs. 
The  reason  that  the  values  for  the  NSMSE  are  sometimes  lower  than  the  CRLB  is  due  to  the  fact 
that  the  estimator  is  slightly  biased,  as  is  evident  from  the  values  of  the  sample  relative  bias,  r\A  , 

Aj  -  A 
A 


Va& 


100 


too 

z 

;=1 


100 


ft>4— I 
100 


tabulated  in  Table  5.4.  Now,  it  was  noted  in  the  implementation  described  above  that  the  estima¬ 
tion  process  involved  executing  two  EM  estimators  of  A  and  each  "converged"  within  a  certain 

number  of  iterations,  namely  p*  and  pj*  .  (The  association  with  the  j  — th  data  set  is  made  explicit 
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Table  5 . 1 .  SAMPLE  MEAN-SQUARE  RELATIVE  ERROR  (  pA  ) 

(100  SAMPLES.  100  RUNS) 


K 

A 

10  2 

10"3 

10-4 

10'5 

f 

o 

H 

10~2 

3.2532  X  10-2 

1.0449  X  10~2 

8.7841  X  10-3 

8.6326  X  10-3 

8.6176  X  10"3 

10"1 

1.3591  X  10-2 

1.C324  X  )0~2 

9.4479  X  10“3 

9.2596  X  10-3 

9.2451  X  10"3 

1 

1.7950  X  10"2 

6.4732  X  10'3 

6.8095  X  10-3 

6.5755  X  10-3 

6.6118  X  10-3 

Table  5.2.  NORMALIZED  SAMPLE  MEAN-SQUARE  ERROR 
(100  SAMPLES.  100  RUNS) 


K 

A 

10-2 

10~3 

10~^ 

10~s 

10-6 

10  ~2 

3.2532 

1.0449 

8.7841  X  10-1 

8.6326  X  10_1 

8.6176  X  10-1 

i 

© 

r-i 

1.3591 

1.0324 

9.4479  X  10_1 

9.2596  X  10'1 

9.2451  X  10_1 

■ 

1.7950 

6.4732  X  10_1 

6.8095  X  10-1 

6.5755  X  10'1 

6.6118  X  10_1 

Table  5.3.  CRAMER-RAO  LOWER  BOUND  FOR  ESTIMATE  OF  A 


K 

A 

10~2 

10-3 

10~“ 

© 

A 

10-6 

10'2 

4.0291 

1.2041 

1.0112 

9.9246  X  10"1 

9.9055  X  10_1 

10-1 

1.3054 

9.7417  X  10-1 

9.3217  X  10-1 

9.2600  X  10"1 

9.2509  X  10_l 

1 

1.3227 

9.6827  X  10_1 

9.0303  X  10'1 

8.9045  X  10_1 

8.8821  X  10"1 
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Table  5.4.  SAMPLE  RELATIVE  BIAS  (t)a  ) 
(100  SAMPLES.  100  RUNS) 


K 

A 

10"2 

io-3 

o 

1 

10-5 

10"* 

10-2 

1.3683  X  10-1 

8.1403  X  10~2 

7.4757  X  10”2 

7.4103  X  10  *2 

MHHM 

10"1 

9.1323  X  10~2 

8.0240  X  10~2 

7.6436  X  10'2 

7.5664  X  10~2 

7.5585  X  10~2 

1 

8.4126  X10'2 

6.8618  X  10'2 

7.1043  X  10"2 

6.8923  X  10~2 

6.9436  X  10'2 

Table  5.5.  WORST-CASE  CONVERGENCE  ITERATION  VALUE 

(100  SAMPLES,  100  RUNS) 


K 

A 

10"2 

lr  3 

T 

© 

10~5 

10“* 

10~2 

7.88 

6.59 

5.32 

4.85 

4.41 

io‘l 

11.18 

6.61 

5.47 

5.24 

5.21 

1 

27.99 

20.02 

18.75 

17.45 

16.90 
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here.)  Let  pmlx  denote  the  sample  worst-case  convergence  iteration  value,  i.e., 

j  100 

P - X  ma x(pf  .pj*). 

iooJ=l 

The  values  for  p  mtx  are  tabulated  in  Table  5.5.  Note  that  the  values  for  this  sample  worst-case 
convergence  iteration  value  are  extremely  low  for  all  parameters  under  consideration  (on  the  order 
of  10). 

In  summary,  we  see  then  that  for  a  sample  size  of  100,  the  proposed  EM  estimator  of  A  per¬ 
forms  very  well  in  a  mean-square  error  sense,  in  terms  of  attaining  the  CRLB,  and  in  terms  of  rate 
of  convergence.  Consequently,  for  the  single-parameter  estimation  problem,  it  yields  an  excellent 
estimator  for  small  sample  sizes. 


5.5.  Two-Parameter  Estimation  Problem 

The  maximization  problem  described  by  (5.8)  will  now  be  considered.  As  for  the  single- 

* 

parameter  case,  define  A e  to  be  the  following  extension  of  the  parameter  set  A: 

A@4  {(A  ,K)r :  9.09xl0_3<A  <  1.1  and  9.09  x  10~7<  J5T  <  1.1X10'2}  . 


* 

ana  let  Ae  be  the  compact  set  over  which  the  maximization  will  be  performed,  i.e., 

0?*  +  1)  =  arg  max  Q^j  (  0.|  0.(/,))  ,  J0(O)€A£  . 

Consider  first  the  gradient  equations 


(5.24) 


—  Go6;(0|i°’))  =  O  (5.25a) 

bA 


and 

—  QobldO\9{p))  =  0  (5.25b) 

ex 

Upon  computing  the  partial  derivatives  on  the  left-hand  sides  of  (5.25a)  and  (5.25b).  the  gradient 
equations  become 
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a.  n  ,  uij 

■  n  +  —  + - 22  Z  z  i  -  =  0  • 

A  A+K  ,slJ=1  Cj-l+K) 


(5.26a) 


+  04  +  *)  Z  Z  h 


=  0  .  (5.26b) 


A  +  K  i=1  ;=1(;-1+*)  j=w=l  ij-l+KT 

Now  (5.26a)  simplifies  straightforwardly  to  a  quadratic  in  A  .  Using  an  argument  analogous  to  the 
one  given  in  Section  5.4,  it  can  be  shown  that  one  of  the  roots  to  this  quadratic  can  be  disregarded, 
and  the  following  closed-form  expression  for  the  parameter  A  can  be  obtained  in  terms  of  the 
parameter  K : 


n*+|r*-a-n-[(n*+£jr*-a-n)2  +  4(n  +  |jr)a*]1 

~  2  (  n  +  g s  ) 


(5.27) 


where  £s  a  £  £*i 


i-w-i  O-1+tf) 


Upon  substituting  the  expression  for  A  given  in  (5.27)  into 


(5.26b),  we  obtain  an  equation  in  K  only.  Thus,  the  two-variable  maximization  problem  has  been 
reduced  to  a  maximization  over  the  single  variable  K .  Unfortunately,  the  resulting  equation  in  K 
is  highly  nonlinear  in  this  parameter.  However,  since  the  maximization  is  performed  over  the  set 

A9  ,  K  takes  on  values  in  the  interval  [9.09  X  10~7. 1.1  X  10  2].  Thus  j—l+K  **/— 1  for  j  ^  2. 
Using  this  approximation,  the  highly  nonlinear  equation  in  K  can  be  greatly  simplified.  Moreover, 
the  square  of  the  simplified  equation  can  be  reduced  to  the  following  fourth-order  polynomial 
equation  in  K  : 


c1K*  +  c2  K3+c^K2+c4K  +  cs  =  0  , 


(5.28) 


where  the  coefficients  c;  (1  ^  i  ^5)  are  functions  only  of  the  observations  and  the  current  estimate 
Qtf >  of  the  parameters.  (The  expressions  for  these  coefficients  are  given  in  Appendix  B.)  The  roots 
of  (5.28)  can  be  readily  determined.  Consider  the  set  consisting  of  the  positive,  real  roots  of  (5.28) 
which  lie  in  the  open  interval  (8  X  10-7,  1.7  X  10  2)  and  which  satisfy  the  simplified,  unsquared 
equation  in  K.  These  roots  can  be  refined  using  Newton’s  root-finding  method  on  the  original 
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equation  initiated  with  these  roots.  For  those  refined  roots  which  take  on  values  in  the  interval 
(9.09  X  10  7.  1.1  X  10  2),2  the  corresponding  value  of  A  is  then  evaluated  using  (5.27).  If  the  value 

of  A  lies  in  (9.09  X  10  3.  l.l),  then  the  resultant  (A  .  K  )  pair  yields  a  stationary  point  in  the  inte- 

* 

rior  of  A9  .  If  the  value  of  A  does  not  lie  in  this  interval,  then  the  resultant  (A  JC  )  pair  yields  a 

* 

stationary  point  that  does  not  lie  in  the  interior  of  A9  .  and  hence,  is  eliminated  from  future  con¬ 
sideration. 

* 

The  above  procedure  yields  those  stationary  points  that  lie  in  the  interior  of  A9  .  Let  us  now 

focus  our  attention  on  the  problem  of  determining  those  stationary  points  that  lie  in  the  interior  of 

« 

each  of  the  four  intervals  which  define  the  boundary  of  A9  .  First,  consider  the  interval  defined  by 
K  =9.09x  10  7  and  A  €[9.09x  10  3.  l.l].  Clearly,  for  fixed  K.  any  stationary  point  must 
satisfy  (5.27)  evaluated  at  this  fixed  value  of  K.  Now.  upon  evaluating  (5.27)  with 
K  -  9.09  X  10~7.  if  the  resulting  value  of  A  lies  in  (9.09  X  10~3,  l.l).  then  the  stationary  point 
defined  by  this  value  of  A  and  K  =  9.09  X  10-7  is  retained  for  future  consideration;  otherwise,  it  is 
disregarded.  Now.  consider  the  interval  defined  by  K  =  1.1  X  10  2  and  A  €  [9.09  X  10  3,  l.l]. 
Again,  upon  evaluating  (5.27)  with  K  —  1.1  X  10-2,  if  the  resulting  value  of  A  lies  in 
(9.09  X  10  3, 1.1),  then  the  stationary  point  defined  by  this  value  of  A  and  K  =  1.1  X  10  2  is 
retained  for  future  consideration:  otherwise,  it  is  disregarded.  Let  us  now  determine  the  stationary 
points  in  the  interior  of  the  interval  defined  by  A  =  9.09  X  10-3  and  K  €  [9.09  X  10-7, 1.1  X  10  2]  . 
Now.  any  stationary  point  in  the  interior  of  this  interval  must  satisfy  (5.26b)  evaluated  at 
A  =  9.09  X  10  3.  Since  (5.26b)  is  highly  nonlinear  in  K ,  the  determination  of  the  roots  of  (5.26b) 
(A  =  9.09  X  10-3)  can  be  simplified  by  using  once  again  the  approximation  that  j  —  1+Jf  =  j  —1  for 
j  ^  2.  Using  this  approximation.  (5.26b)  can  be  reduced  to  the  following  fourth-order  polynomial 
equation  in  K  : 

2It  was  verified  via  simulation  that  the  interval  (8xlO”7  ,  1.2_xl0”2)  was  sufficiently  large  to  contain  those  roots  whose 
refinement  yields,  at  least,  all  roots  of  the  original  equation  in  (9.09x10  ,1.1X10” 2). 
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dlK4+d2  K3+d3K2+d4K  +  d5  =  0  .  (5.29) 

where  d1  =  fi4  ,  d2  =  2A/34  — 02  .  d2  =  $4A2  —  02A  +  /33  —  Px  +  n.  d4  =  2/33A  —  0jA 
d 5  =  /33  A2,  and  where  the  j3; ’s  (1^  i  ^4)  are  functions  only  of  the  observations  and  the  current 
estimate  of  the  parameters.  (Thf  expressions  for  these  j3j 's  are  given  in  Appendix  B.)  Consider  the 
positive,  real  roots  of  (5.29)  that  lie  in  the  interval  (8  X  10  7. 1.2  X  10  2).  These  roots  can  be 
refined  using  Newton's  root-finding  method  on  (5.26b)  (A  =  9.09  X  10-3)  initiated  with  these  roots. 
If  the  value  of  the  refined  root  lies  in  the  interval  (9.09  X  10-7. 1.1  X  10-2).  then  the  stationary 
point  defined  by  A  =  9.09  X  10-3  and  K  equal  to  the  value  of  the  refined  root  is  retained  for  future 
consideration.3  Otherwise,  it  is  disregarded.  The  stationary  points  in  the  interior  of  the  interval 
defined  by  A  =  1.1  X  10-2  and  K  €  [9.09  X  10-7. 1.1  X  10-2]  are  determined  in  a  completely  analo¬ 
gous  manner. 

* 

The  function  Q^j  is  now  evaluated  at  all  stationary  points  in  the  interior  of  Ae  ,  at  all  sta- 

* 

tionary  points  in  the  interior  of  each  of  the  four  intervals  which  define  the  boundary  of  Ae  and  at 

the  four  comer  points  of  :  (9.09  X  10-3, 1.1  x  10  2)r  ,  (9.09  x  10  3,  9.09  X  10  7)r , 

(1.1  .  1.1  X  10  2)r,  and  (1.1 . 9.09  X  10-7)r.  Of  the  points  at  which  Q^j  is  evaluated,  the  solution 
to  (5.24)  is  taken  to  be  that  point  which  maximizes  Q^j  •  For  a  given  we  will  refer  to  this 
solution  as  the  EM  estimator  of  Q_. 

5.5.1.  Implementation  of  EM  estimator  of  9. 

As  for  the  single-parameter  estimation  problem,  two  EM  estimators  of  0.  were  executed  in 

parallel,  one  initiated  with  (9.09  X  10  .  1.1  X  10  )  (the  vector  in  Ae  for  which  the  ratio  of  the 

second  coordinate  to  the  first  is  largest),  the  other  initiated  with  (10-1,  lO”4)7^  (an  interior  vector 

* 

of  A9  for  which  the  ratio  of  the  second  coordinate  to  the  first  is  the  logarithmic  mean  over  all 

JAgam,  it  was  verified  via  simulation  that  the  interval  (8xlO-7  , 1._2xl0-J)  was  sufficiently  large  to  contain  those  roots 
whose  refinement  yields,  at  least,  all  roots  of  the  original  equation  in  (9.09x10  , l.lxlO-2). 
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possible  values  that  this  ratio  can  take  on  for  parameter  vectors  in  Ae).4  Let  {0.  ^ 5}  4  {(A^,1.  } 

and  (JL  2^  ^  ^  {(A?2,  -Kp2)r )  .  p  —  0 . 1  .  ,  denote  the  sequences  of  estimates  obtained  via  these  two 

EM  estimators  (d_^0)  =  (9.09  X  10  3.  1.1  X  10  ")r  and  0_2°'>  —  (10  1,  10~^)r).  Both  EM  estimators 
"converged"  in  the  sense  that,  for  each  estimator,  the  magnitude  of  the  relative  difference  in  the 
values  of  the  estimate  (of  each  parameter)  on  successive  iterations  decreased  as  the  number  of 
iterations  increased.  For  each  of  the  two  sequences  of  estimates  {  0./^}  ,  i  =1 ,2,  the  "convergence 
iteration  value"  was  then  taken  to  be  the  minimum  iteration  value  l  (f  ^1)  for  which 
|(U/-a/_1  )M/)|  <  10-7  and  !((*/-*/_!  )/Kl)\  <  10-7.  Let  p‘  and  p "  ip'.p"  >  l)  denote  the 

convergence  iteration  values  for  {JLi^}  and  {JL^},  respectively.  Then  of  the  two  "limits"  0.  ^  ) 

and  0_2  •  tbe  estimate  0. of  the  true  parameter  vector  was  chosen  to  be  that  limit  which  maximized 

the  traditional  inccmplete-data  likelihood  function,  i.e., 

j?  =  arg  max  L  (  0.)  . 

where 

It  oo 

£(0>LlnZ>,(A)/,(zj;A..fir)  . 

;  =x  j  =i 

Let  us  examine  the  performance  of  this  estimator. 

5J5.2.  Simulation  results 

As  for  the  single-parameter  estimation  problem,  an  extensive  simulation  study  of  the  pro¬ 
posed  estimator  was  performed.  Using  100  data  sets,  each  containing  100  samples  generated  from 
the  Class  A  envelope  pdf,  the  sample  mean-square  norm  relative  error  (MSNRE)  was  first 

“It  was  observed  via  simulation  that  the  convergence  of  the  EM  estimator  to  a  vector  that  closely  approximated  the 
true  parameter  vector  was  highly  dependent  on  the  value  of  this  ratio.  In  particular,  if  the  ratio  of  the  second  coordinate  of 
the  initiating  vector  to  the  first  exceeded  (but  not  considerably)  the  corresponding  ratio  for  the  true  parameter  vector,  then 
the  EM  estimator  was  guaranteed  tc  converge  to  a  vector  that  closely  approximated  the  true  parameter  vector.  Thus,  given 
the  two  initiating  vectors,  at  least  one  of  the  EM  estimators  was  guaranteed  to  converge  to  a  vector  with  this  property.  This 
restriction  on  the  above-defined  ratio  was  also  seen  to  be  a  necessary  condition  for  convergence  of  the  BDD  algorithm  formu¬ 
lated  in  Section  4.2. 
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computed  for  each  0.  6  SI  .  (The  definition  of  SI  was  given  in  Section  5.4.3.)  Let  pe  denote  this 
quantity.  Then. 


Pe~ 


too 


100 


)=  i 


where  9_j  A  .  K, )  (1  ^  100)  denotes  the  estimate  of  0.  obtained  using  the  /  — th  data  set. 

The  values  for  p9  are  taouiatea  tn  Table  5.6.  Note  from  this  table  that  the  values  of  this  relative 
error  are  quite  low  for  all  parameter  vectors  under  consideration  (on  the  order  of  10  2  to  10  *). 
The  values  of  the  normalized  sample  mean-square  norm  relative  error  (4  n  X  MSNRE)  were  then 
computed  (see  Table  5.7)  and  compared  to  the  CRLB.  (The  values  of  the  CRLB  were  given  in  Table 
3.8.)  Note  that  the  values  of  the  normalized  MSNRE  are  either  close  to  the  CRLB  or  lower  than  the 
CRLB.  The  reason  that  the  values  for  the  normalized  MSNRE  are  sometimes  lower  than  the  CRLB 
is  due  to  the  fact  that  the  estimator  is  somewhat  biased,  as  is  evident  from  the  values  of  the  sample 
relative  bias.  7)fl. 


too 

z 

1=1 


tabulated  in  Table  5.8.  Now.  it  was  noted  r  that  the  estimation  of  0.  involved  the  execution  of 
two  EM  estimators  of  Q_  in  parallel  and  eaci  nverged"  within  a  certain  number  of  iterations, 
namely  pj  and  p”  .  (The  association  with  the  j —th  data  set  is  made  explicit  here.)  Let  />  ~ax 
denote  the  sample  worst-case  convergence  iteration  value,  i.e., 

j  too 

4  — Z  ma*^/  -pP- 

The  values  for  are  tabulated  in  Table  5.9.  As  was  the  case  for  the  single-parameter  estima¬ 
tion  problem,  note  that  the  values  for  this  sample  worst-case  convergence  iteration  value  here  are 
also  extremely  low  for  all  parameter  vectors  under  consideration  (on  the  order  of  10). 


In  summary  then,  we  see  that  for  a  sample  size  of  100,  the  proposed  EM  estimator  of  6_  per¬ 
forms  again  very  well  in  a  mean-square  error  sense,  in  terms  of  attaining  the  CRLB,  and  in  terms 
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Table  5.6.  SAMPLE  MEAN-SQUARE  NORM  RELATIVE  ERROR  (  p9  ) 

(100  SAMPLES.  100  RUNS) 


K 

A 

10  2 

10-3 

10~* 

1 

O 

▼H 

10-6 

10~2 

5.6312  X  10“2 

7.7317  X  10-1 

7.0539  X10-1 

6.9907  X  10'1 

6.8783  X  10'1 

10-1 

8.5002  X  10-2 

1.3443  X  10"1 

1.2567  X  10_1 

1.2522  X  10_1 

7.484  X  10-2 

1 

3.4455  X  10~2 

4.9856  X  10"2 

4.8199  X  10-2 

4.6672  X  10-2 

3.0990  X  10"2 

I 

I 

Table  5.7.  NORMALIZED  MEAN-SQUARE  NORM  RELATIVE  ERROR 

(100  SAMPLES.  100  RUNS)  _ 


K 

IQ’2 

10  3 

T 

o 

vH 

10~5 

10-6 

A 

5.6312 

77.317 

70.539 

69.907 

68.783 

8.5002 

13.443 

12.567 

12.522 

7.484 

3.4455 


4.9856 


4.8199 


4.6672 


3.0990 
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Table  5.8.  SAMPLE  RELATIVE  BIAS  (  7)e  ) 
(100  SAMPLES.  100  RUNS) 


K 

A 

10-2 

10-3 

T 

o 

lO"5 

10^ 

10-2 

2.6277  X  10_1 

6.7566  X  10_1 

6.5280  X  10"1 

6.5049  X  10-1 

6.0676  X  I0_i 

10_1 

3.2896  X  10'1 

4.1521  X  10-1 

4.0252  X  10-1 

4.0185  X  10'1 

2.7972  X  10'1 

1 

2.3146  X  10_1 

2.3049  X  10_1 

1.8362  X  10_1 

Table  5.9.  WORST-CASE  CONVERGENCE  ITERATION  VALUE  (  p  ) 

(100  SAMPLES.  100  RUNS) 


K 

A 

10~2 

10"3 

10-4 

10-5 

10^ 

10'2 

4.98 

14.80 

6.78 

4.22 

4.25 

10-1 

22.24 

10.70 

8.70 

8.53 

7.82 

27.98 


23.86 


22.73 


22.62 


20.38 
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of  rate  of  convergence.  Consequently,  it  yields  an  excellent  estimator  for  small  sample  sizes  for  the 
two-parameter  estimation  problem  as  well. 

5.6.  Conclusions 

In  summary,  we  have  seen  in  this  chapter  that  the  EM  algorithm  uses  the  properties  of  the 
incomplete-date  likelihood  function  in  the  estimation  process,  and  so  is  ideally  suited  for  estimat¬ 
ing  the  parameters  of  mixture  densities  such  as  the  Class  A  density.  It  has  many  desirable  features, 
e.g..  its  monotonicity  property  for  the  likelihood  sequence  and  its  estimation  potential  for  small 
sample  sizes.  Moreover,  within  the  context  of  estimating  the  Class  A  parameters,  we  have  shown 
that,  for  the  single-parameter  estimation  problem,  the  sequence  of  EM  estimates  converges. 
Furthermore,  we  have  shown  that  if  the  limit  point  to  which  the  sequence  converges  is  an  interior 
point  of  the  compact  set  over  which  the  maxmization  is  performed,  then  it  must  necessarily  be  a 
stationary  point  of  the  likelihood  function  of  interest.  Using  an  implementation  based  on  the  exe¬ 
cution  of  two  EM  algorithms  in  parallel,  an  extensive  simulation  study  for  both  the  single¬ 
parameter  and  two-parameter  estimation  problems  was  also  performed.  The  results  of  these  stu¬ 
dies  indicate  that  the  proposed  EM  estimator  does,  in  fact,  yield  an  excellent  estimator  of  the  Class 
A  parameters  for  small  sample  sizes  for  all  parameter  vectors  in  the  parameter  set  of  interest. 
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6.  SUMMARY  AND  CONCLUSIONS 

In  this  work,  we  have  developed  and  examined  the  performance  of  various  optimal  and  near- 
optimal  identification  procedures  for  the  Class  A  interference  model. 

We  considered  first  the  problem  of  basic  batch  estimation  of  the  Class  A  parameters  from  an 
independent  sequence  of  Class  A  envelope  samples.  From  this  study,  it  was  seen  that  the  method- 
of-moments  estimator  based  on  the  fourth  and  sixth  moments  yielded  strongly  consistent  and 
asymptotically  normal  estimates  of  the  parameters,  but  was  highly  inefficient  due  to  the  insensi¬ 
tivity  of  the  moments  to  changes  in  the  parameter  K.  However,  via  an  examination  of  the 
Cramer-Rao  Lower  Bound,  it  was  seen  that  a  tremendous  improvement  in  performance  over  the 
method-of-moments  estimator  was  possible  if  an  asymptotically  efficient  estimator  could  be  found. 
Unfortunately,  maximum  likelihood  estimation  proved  to  be  computationally  unwieldy  due 
largely  to  the  multiplicity  of  roots  in  the  likelihood  equation.  However,  by  initiating  Newton’s 
root-finding  method  on  the  likelihood  equation  with  the  method-of-moments  estimator,  a  pro¬ 
cedure  was  obtained  that  combined  the  consistency  and  efficiency  of  the  two  approaches.  Despite 
its  asymptotic  optimality,  this  Moment/LLkelihood  procedure  did  not  perform  well  for  moderate 
sample  sizes  because  of  the  extremely  high  inefficiency  of  the  moments  estimator.  However,  by  ini¬ 
tiating  likelihood  search  with  the  more  efficient,  physically-motivated  Threshold-Comparison  esti¬ 
mator.  a  batch  estimator  of  the  Class  A  parameters  was  obtained  that  yields  good  estimates  of  the 
parameters  for  moderate  sample  sizes. 

The  problem  of  recursive  identification  of  the  Class  A  parameters  was  then  addressed.  The 
starting  point  in  the  development  of  a  global,  recursive  estimator  of  the  parameters  was  the  BDD 
algorithm.  This  algorithm  is  physically  motivated,  easy  to  implement,  and  is  a  recursive  version  of 
the  Threshold-Comparison  estimator,  which  was  seen  in  the  batch  estimation  problem  to  yield 
accurate  estimates  of  the  parameters.  In  particular,  this  basic  decision-directed  algorithm  is  based 
on  an  adaptive.  Bayesian  classification  of  each  of  a  sequence  of  Class  A  envelope  samples  as  an 
impulsive  sample  or  as  a  background  sample.  As  each  sample  is  so  classified,  recursive  updates  of 
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the  estimates  of  the  second  moment  of  the  background  component  of  the  interference  envelope  den¬ 
sity,  the  second  moment  of  the  impulsive  component  of  the  interference  envelope  density,  and  the 
probability  with  which  the  impulsive  component  occurs,  are  readily  obtained.  From  these  esti¬ 
mates.  estimates  of  the  parameters  of  the  Class  A  model  follow  straightforwardly,  since  closed- 
form  expressions  exist  in  terms  of  these  quantities.  Examination  of  the  performance  characteristics 
of  the  algorithm  revealed  two  inherent  drawbacks  of  the  algorithm,  which  adversely  affect  its  per¬ 
formance  even  locally.  However,  by  imposing  the  necessary  restrictions  on  the  form  of  the  initia¬ 
tion  vector  for  the  algorithm  and  incorporating  the  appropriate  modifications  into  its  framework, 
the  ensuing  difficulties  associated  with  these  two  drawbacks  can  be  eliminated.  The  result  was  a 
global,  recursive  estimator  of  the  Class  A  parameters  that  yields  excellent  estimates  of  the  parame¬ 
ters  for  all  parameter  vectors  in  the  parameter  set  of  interest. 

The  problem  of  efficient  estimation  of  the  Class  A  parameters  for  small  sample  sizes  was  then 
considered  Tv-  proposed  estimator  was  based  on  the  FM  algorithm,  a  two-step  iterative  estimation 
technique  which  was  ideally  suited  for  the  Class  A  estimation  problem  since  the  observations  could 
be  readily  treated  as  "incomplete  data."  For  the  single-parameter  estimation  problem  (A  unknown, 
K  known),  a  closed-form  expression  f^r  the  estimator  was  obtained.  The  convergence  properties  of 
the  EM  algorithm  as  they  pertain  to  the  Class  A  estimation  problem  were  also  examined.  Again, 
for  the  single-parameter  estimation  problem,  (A  unknown,  K  known),  it  was  shown  that  the 
sequence  of  estimates  obtained  via  the  EM  algorithm  converges.  Moreover,  it  was  shown  that  if  the 
limit  point  to  which  the  sequence  converges  is  an  interior  point  of  the  set  over  which  the  optimiza¬ 
tion  is  performed,  then  it  must  necessarily  be  a  stationary  point  of  the  traditional  likelihood  func¬ 
tion.  The  small-sample-size  performance  of  the  EM  algorithm  was  also  examined  via  simulation 
(for  both  the  single-parameter  and  two-parameter  estimation  problems).  For  each  true  parameter 
vector,  two  EM  algorithms  were  executed  in  parallel,  each  initiated  with  a  different  initiating  vec¬ 
tor.  the  pair  of  initiating  vectors  being  fixed  for  each  true  parameter  vector.  For  each  initiating  vec¬ 
tor,  the  EM  algorithm  converged  to  a  limit  vector  in  the  parameter  set  of  interest,  and  the  estimate 
vector  was  then  taken  to  be  the  limit  vector  that  maximized  the  incomplete-data  likelihood 
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function.  For  each  initiating  vector  and  true  parameter  vector,  the  EM  algorithm  converged  to  the 
limit  vector  within  relatively  few  iterations  (on  the  order  of  10).  Moreover,  via  an  extensive  simu¬ 
lation  study,  it  was  seen  that  this  likelihood-based  scheme  yields  excellent  estimates  of  the  parame¬ 
ters  of  the  Class  A  model  (in  terms  of  attaining  the  Cramer-Rao  Lower  Bound)  for  small  sample 
sizes  (on  the  order  of  1CT). 

This  study  has  been  devoted  to  the  problem  of  obtaining  optimal  and  near-optimal 
identification  procedures  for  the  parameters  of  the  Clas^  A  interference  model.  A  thorough  investi¬ 
gation  of  this  problem  has  been  made,  with  the  objective  of  obtaining  theoretically  optimal  and 
practically  efficient  estimation  procedures  for  these  parameters.  The  problem  of  efficient  estimation 
in  both  the  batch  and  recursive  frameworks  has  been  addressed.  It  is  anticipated  that  the  results  of 
this  study  will  find  widespread  application  in  the  areas  of  digital  communications,  sonar,  and  radar 
due  to  the  common  occurrence  of  impulsive  channels  in  such  systems. 
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APPENDIX  A.  DERIVATION  OF  RESTRICTIONS  ON  BDD  ALGORITHM 


Let  us  now  determine  the  source  of  the  two  flaws  of  the  BDD  algorithm  given  in  Section  4.2.2. 
Xote  that,  in  a  given  iteration  of  the  algorithm,  the  only  error  that  can  be  incurred  is  in  the 
decision-making  process,  and  the  decision-making  process  is  based  on  a  LRT  that  uses  an  estimate 
of  the  true  parameter  vector  in  the  LR  to  classify  the  given  sample.  Thus,  by  examining  the  effects 
on  the  error  probability  in  the  classification  process  due  to  the  mismatch  introduced  from  a  lack  of 
knowledge  of  the  true  parameter  vector  value,  some  insight  can  be  gained  into  the  source  of  the 
two  flaws  of  the  BDD  algorithm.  In  particular,  for  a  comprehensive  range  of  values  of 
(.4  K .  cr2)  6  A  and  (A.  A.  a2)  €  A\  we  can  compute  the  probability  of  making  an  incorrect 
classification  when  ( A.K.cf 2)  is  used  in  the  LRT  to  classify  the  given  observation  given  that 
{A  .  K .  cr2)  is  the  true  parameter  vector.  The  expression  for  this  probability  of  error  can  be  easily 
derived  by  noting  from  the  definition  of  ~  ~  )  given  in  Section  4.2.1  that  this  is  simply  the 
probability  of  making  an  incorrect  classification  when  the  decision  regions 
(0.  T°p j----  '),  \  oo)  are  used,  given  that  (A  .  K .  cr2)  is  the  true  parameter  vector.  Let  us 

denote  this  probability  of  error  by  Pe  ((A  ,  K .  cr2)  ;  (A  .  K  ,  cr2)).  (In  the  sequel,  the  parenthetical 
arguments  will  be  included  only  if  necessary  for  clarity.)  We  then  have  that 


P.  ((A  .  K .  V2) ;  (A  .  K  .  cr2)) 


=  e 


- A 


r(4*£*£  > 

opt 


f  2  p0(z;  A  .  K,cr2)dz  +(l-e  A)J  p  x(z  :  A  .  K  ,  cr2)  dz 


T  U ,  X ,  a 


=  e 


-A 


r°°  2  z 

A  +  K 
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K 
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l  K 
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r(A.A,< 7J) 

A  +  K 

A  +K  j 

}J  - 
J  0 

2e  £  Am 

z  e  <T2 

m  +  K  j 

cr2<il-e-A)„=lm\ 

m  +  K 

(A.l) 


dz 


After  some  manipulation.  (A.l)  yields  the  following  expression  for  Pt  ((A  ,  K  ,  cr2)  ;  (A  .  K  ,  c2)): 
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P.  ((A.K.a2):(A.K,  a2)) 


-A 

e  e 


(r(d.x.£>¥ 

.r2 

X 

+  (  1  —  e  _A  ) 


-  z 


go  —A  ,  m 

e  A 


(  Cd.x.£jy 

Oft 

A+X 

<rJ 

m  +  K 

(A.2) 


m  =1 


m ! 


Let  ft  A  {(  a.  0.  £)  :  (  a.  0.  £)  €  A  and  log  a  6  Z  .  log  0  6  Z  ,  log  £  =  0}.  Given  the  above  expression 
for  Pt .  we  have  computed,  for  each  (A  .  K .  cr2)  6  ft,  .Pe  ((A  ,  K ,  cr2)  ;  (A  ,  K ,  cr2))  for  all 
(  A.K .  a2)  €  ft.  It  is  evident  from  (A.2)  that  the  evaluation  of  these  error  probabilities  requires 

that  first  be  evaluated  for  all  (  A ,  IC .  cr2)  6  ft.  Since  /  ( z'.A.K.gi )  is  a  strictly 

monotone  increasing  function  of  z  for  each  (  A  ,  K .  a  2)  6  A' .  this  was  easily  done  using  a  numeri¬ 
cal  search  procedure.  The  results  are  tabulated  in  Table  A.l.  (All  tables  referenced  in  this  appen¬ 
dix  can  be  found  at  the  end  of  the  appendix.)  Using  the  tab  ala  ted  values  for  )  and  (A.2). 

Pe  ((A  .  K .  a2)  :  (A  .  K .  cr2))  was  then  computed  for  the  aforementioned  values  of  {A.K,  cr2)  and 
(A  .  K .  cr2).  The  computed  error  probabilities  are  tabulated  in  Tables  (A.2)-(A.6),  (A.7)-(A.ll), 
and  (A.12)-(A.l6).  We  will  now  use  these  error  probabilities  to  explain  the  observed  difficulties  of 
the  BDD  algorithm  cited  above. 


Explanation  of  Observed  Difficulties 

We  will  now  justify  the  observed  difficulties  of  the  BDD  algorithm,  and,  in  so  doing,  obtain 

restrictions  on  the  form  of  the  initiation  vector  (A0,  K0.  o’q)  for  the  algorithm.  As  stated  in  Sec- 

-  2 

tion  4.2.2.  the  observed  difficulties  of  the  algorithm  are  cited  for  the  case  when  c rn  is  fixed  so  that 
<xa  =  cr  =1  for  all  n  ^  0  and,  thus,  will  be  justified  for  this  case. 

Consider  the  tables  for  P€ .  Since  T  A  £/A  (see  Eq.  (2.6)),  we  see  that  for  a  given  table,  the 
lines  of  constant  r.  correspond  to  diagonals  along  that  table,  with  r.  attaining  a  maximum  value 
of  1  when  {A.K)  —  (10-2.  10-2)  and  a  minimum  value  of  10-6  when  (  A  .  K )  =  (1  ,  lCf6).  Note 
from  tables  (A.2)-(A.6)  and  (A.7)-(A.ll)  that  for  fixed  A  ,  Pe  ((A  ,  K ,  a;2)  ;  (A  .  K .  cr2))  becomes 
very  large  very  rapidly  with  decreasing  K  for  parameter  vectors  (A  .  K.S.)  for  which  T.  is  less 
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than  T.  Moreover,  for  values  of  the  parameter  vectors  ( A,K .  <x2)  for  which  T,  ^  I*.  Pe  for  these 
parameter  vectors  is  relatively  small.  Thus,  we  have  a  so-called  "diagonal  effect."  wherein  for 
values  of  the  parameter  vectors  (A  .  K ,  cr2)  for  which  £.  <  I\  the  probability  of  error  incurred  in 
the  decision-making  process  by  using  these  vectors  in  the  LRT  when  (A  .  K ,  cr2)  is  the  true  parame- 
ter  vector  is  relatively  large,  whereas  for  values  of  the  parameter  vectors  (A  .  K  ■  <L  )  for  which 
T  >  T,  the  probability  of  error  incurred  is  relatively  small.  Thus,  for  each  {A  ,  K ,  cr2).  the  "diago¬ 
nal  effect"  divides  the  plane  of  A  consisting  of  parameter  vectors  for  which  the  third  coordinate 
has  unity  value  into  an  upper-diagonal  and  a  lower-diagonal  region  as  defined  by  the  line 
corresponding  to  constant  T.  To  parameter  vectors  in  the  upper-diagonal  region,  we  can  associate 
values  of  Pt  that  are  predominantly  small,  and  to  those  in  the  lower-diagonal  region,  we  can  asso¬ 
ciate  values  of  Pt  that  are  predominantly  large. 

This  phenomenon  justifies  the  initial  portion  of  the  observation  cited  in  (ii)  of  Section  4.2.2, 
namely,  that  for  initiation  vectors  (A0.  K 0.  cr0 )  for  which  T0  ^  T,  the  frequency  with  which  the 
algorithm  converges  to  the  true  parameter  vector  is  relatively  high,  whereas  for  initiation  vectors 
( A0 ,  K Q,  <x02)  for  which  f0<  T.  the  frequency  with  which  the  algorithm  converges  to  the  wrong 
parameter  vector  is  relatively  high.  We  see  then  that  the  diagonal  effect  can  be  used  advanta- 
geously  by  always  initiating  the  BDD  algorithm  with  an  initiation  vector  (A0,  K0,  cr0)  for  which 

(1)  f0  either  provides  an  accurate  estimate  of  T,  or.  T0  provides  an  estimate  of  T  for  which 

f0>  r. 

hat  for  each  true  parameter  vector  (A  ,  K  .  cr2)  the  upper-diagonal  and  lower-diagonal  regions 
u.  defined  by  the  line  of  constant  T  have  the  aforementioned  properties  of 
Pt  (ti-  A .  <r  )  ;  (A  .  K  .  cr  ))  associated  with  them  if  ^  =  cr  and  cr  =1.  The  constraint  given  by 
the  first  equality  demands  that  the  restriction  given  in  (l)  be  accompanied  by  the  additional 


restriction  that 
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¥"*  2  2 
(2)  cr0  must  provide  an  accurate  estimate  of  cr  . 

^2  2  2 

Given  that  <x0  accurately  estimates  <J  ,  we  will,  in  addition,  replace  (4.8c)  with  an  estimator  of  cr 

-  2  2 

consisting  of  an  update  equation  for  cr0.  In  so  doing,  we  will  obtain  an  estimate  equation  for  cr 

2 

which  yields  an  accurate  estimate  of  cr  at  each  iteration  of  the  algorithm. 

The  second  constraint,  namely  cr  =1,  does  not  impose  any  additional  restrictions  for  the  fol¬ 
lowing  reason:  From  the  approximation  given  in  Step  (2')  of  the  MBDD  algorithm  (Section  4.3.1), 
it  can  be  seen  that,  via  some  minor  approximations,  the  dependence  of  (r0^~’  on  cr2  is  essen¬ 

tially  linear  for  all  (A  .  K .  cr2)  €  A.  From  (A.2),  it  then  follows  that  Pe  ((.A  .  K ,  <r2)  ;  (.A  ,  K ,  cr2)) 
depends  only  on  the  ratio  of  cr  to  cr  .  i.e..  given  that  cr  =  cr  ,  the  values  for 
P€  ((A  .  K .  cr2)  ;  (A  .  K  .  cr2))  are  essentially  invariant  to  changes  in  cr2  for  each  true  parameter  vec¬ 
tor  (A  .  K .  cr2).  Consequently,  since  restriction  (2)  takes  into  account  the  fact  that  the  properties 

2  2  2 

for  Pt  are  valid  only  when  cr  =  cr  .  no  additional  restriction  on  cr  need  be  imposed. 

Note,  in  addition,  from  Tables  (A.7)-(A.ll)  that  for  each  true  parameter  vector  (A  .  K .  cr 2) 
for  which  A  =  1.  there  exists  a  set  of  vectors  (A  ,  K ,  <x2)  for  which  T.  >  T  but  for  which  Pt  once 
again  becomes  relatively  large,  namely,  the  set  of  vectors  for  which  JT  is  approximately  on  the 
order  of  10-1  or  larger.  The  values  for  Pt  tabulated  for  these  T.  support  the  observation,  cited  in 
(ii)  of  Section  4.2.2,  that  for  values  of  A  close  to  1,  the  frequency  with  which  the  BDD  algorithm 
converges  to  the  wrong  parameter  vector  becomes  relatively  high  for  values  of  T0  approximately  on 
the  order  of  10-1  or  larger.  Instead  of  translating  this  observation  into  a  restriction  on  T0  as  was 
done  with  the  diagonal  effect,  we  will  see  in  the  sequel  that  this  observation  more  readily  translates 
into  a  restriction  on  A  0. 

Consider  the  observations  cited  in  (i)  of  Section  4.2.2.  Suppose  that  we  have  an  iteration  n" 
for  which  the  associated  proportion  of  samples  classified  as  impulsive  exceeds  the  expected  percent¬ 
age  (and  for  which  all  samples  at  iterations  n  ^  n"  have  been  correctly  classified).  (In  the  sequel, 
this  will  be  referred  to  as  a  situation  wherein  there  is  a  "disproportionate  excess  of  impulses.")  Let 
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us  see  how  this  can  cause  the  algorithm  to  behave  poorly:  Since  the  proportion  of  samples  classified 
as  impulsive  at  iteration  n"  exceeds  the  expected  percentage,  it  follows  from  (4.8a)  that  the  esti¬ 
mate  of  A  at  that  iteration  will  exceed  the  true  A  .  Now,  if  the  estimate  of  K  at  iteration  n"  is 
such  that  the  estimate  of  T  (  which  is  defined  to  be  the  estimate  of  K  divided  by  the  estimate  of 
A  )  at  that  iteration  either  exceeds  or  equals  the  true  T.  then  the  estimate  vector  at  iteration  n" 
will  lie  in  the  upper-diagonal  region  corresponding  to  the  true  parameter  vector.  Consequently,  by 
the  "diagonal  effect."  the  probability  of  error  incurred  in  the  decision-making  process  at  the  next 
iteration  of  the  algorithm  will  tena  to  be  relatively  low,  and  thus,  with  relatively  high  probability, 
the  sample  at  iteration  n"+  1  will  be  classified  correctly.  However,  if  the  estimate  of  K  at  iteration 
n"  is  such  that  the  estimate  of  T  is  less  than  the  true  T,  then  the  estimate  vector  at  iteration  n" 
will  lie  in  the  lower-diagonal  region  corresponding  to  the  true  parameter  vector  and.  by  the  "diago¬ 
nal  effect."  the  probability  of  error  incurred  at  the  next  iteration  of  the  algorithm  will  tend  to  be 
relatively  high.  Now.  from  Table  A.l.  we  see  that  if  the  estimate  vector  is  such  that  the  estimate 
of  A  is  greater  than  the  true  A  and  the  estimate  of  T  is  less  than  the  true  T,  then  the  optimum 
threshold  corresponding  to  the  estimate  vector  will  be  less  than  the  optimum  threshold  correspond¬ 
ing  to  the  true  parameter  vector.  Given  this  and  the  relatively  high  probability  of  an  incorrect 
classification  associated  with  the  estimate  vector  lying  in  the  lower-diagonal  region,  we  have  a 
situation  wherein  a  background  sample  can  be  incorrectly  classified  as  an  impulsive  sample  at  the 
next  iteration  of  the  algorithm.  Since  the  estimate  of  A  is  directly  proportional  to  the  proportion 
of  samples  that  have  been  classified  as  impulsive  (of  course,  within  the  boundary  restrictions 
imposed  by  A),  an  incorrect  classification  of  a  background  sample  as  impulsive  at  iteration  n"+  1 
will  raise  the  estimate  of  A  even  further.  Thus,  via  these  successively  increasing  values  for  the 
estimate  of  A  .  the  estimate  vector  can  potentially  be  "forced  away"  from  the  true  parameter  vec¬ 
tor.  We  see  then  that  a  disproportionate  excess  of  impulses  at  a  given  iteration  of  the  algorithm  can 
have  a  detrimental  effect  on  its  performance. 

Note  that  the  likelihood  with  which  the  performance  of  the  algorithm  is  adversely  affected 
increases  with  increasing  values  obtained  for  the  estimate  of  A  ,  given  that  the  estimate  of  A 
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exceeds  the  true  value.  This  is  accounted  for  as  follows:  Consider  Tables  (A.4)-(A.6)  and  (A.7)- 
(A.ll).  In  each  table,  note  that  for  a  fixed  but  arbitrary  value  of  A.  the  portion  of  the  lower- 
diagonal  region  consisting  of  the  parameter  vectors  for  which  the  first  coordinate  is  A  is  an  inter¬ 
val,  and  the  length  of  this  interval  increases  with  increasing  A.  Thus,  as  the  estimate  of  A 
increases,  the  length  of  the  interval  associated  with  this  estimate  increases.  Since  this  interval 
defines  the  set  of  parameter  vectors  under  consideration  for  which  the  first  coordinate  is  the  esti¬ 
mate  of  A  and  for  which  the  ratio  of  the  second  coordinate  to  the  first  is  less  than  the  true  T,  it 
follows  that  as  the  estimate  of  A  increases,  the  likelihood  that  the  estimate  of  T  will  be  less  than 
the  true  T  increases.  Given  the  ensuing  detrimental  effect  on  the  behavior  of  the  algorithm 
(described  in  the  previous  paragraph)  associated  with  the  estimate  of  A  being  greater  than  the  true 
A  and  the  estimate  of  T  being  less  than  the  true  T.  it  follows  that  the  increased  likelihood  of  the 
estimate  of  T  being  less  than  the  true  T  in  turn  increases  the  likelihood  that  the  performance  of  the 
algorithm  will  be  adversely  affected. 

It  was  noted  in  (i)  of  Section  4.2.2  that  the  convergence  of  the  BDD  algorithm  is  sensitive  to 
the  distribution  of  impulses  in  the  initial  stages  of  the  algorithm.  That  a  disproportionate  excess  of 
impulses  at  a  given  iteration  of  the  algorithm  is  more  likely  to  affect  its  convergence  if  the  iteration 
occurs  in  the  initial  stages  of  the  algorithm  follows  from  the  fact  that  the  estimates  of  the  update 
parameters  and.  hence,  the  estimates  of  the  model  parameters,  are  more  sensitive  to  changes  in  the 
number  of  samples  classified  as  impulsive  in  the  initial  stages  of  the  algorithm.  For  example,  sup- 
pose  that  the  true  value  of  the  parameter  A  is  10  .  Since  the  corresponding  true  value  for  vl  is 
9.95X10-3.  suppose,  in  addition,  that  tTjCIOO)  =  9.95X10-3.  Now,  if  the  sample  at  the  101-st  itera¬ 
tion  is  classified  as  impulsive,  then  w,1(10l)  will  be  1.98X10-2.  However,  if  tt1(2 000)  =  9.95x10  3 
and  the  sample  at  the  2001-st  iteration  is  classified  as  impulsive,  then  7r1(2001)  will  be  1.04x10  2. 
Thus,  even  though  the  proportions  of  samples  classified  as  impulsive  at  the  100-th  and  2000-th 
iterations  are  the  same  (and  are  equal  to  the  true  value),  an  increase  in  the  number  of  samples 
classified  as  impulsive  by  one  (resulting  in  a  disproportionate  excess  of  impulses)  yields  a  larger 
value  for  the  estimate  of  ,rr1  at  the  101-st  iteration,  and  hence,  a  larger  value  for  the  estimate  of  A  , 
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than  at  the  2001-st  iteration  (A  101  =  2.00X10-2  whereas  A2001  =  1.05X10-2).  Since  the  estimate  of 
A  associated  with  the  101-st  iteration  is  larger  than  that  corresponding  to  the  2001-st  iteration 
(and  both  exceed  the  true  value),  it  follows  from  the  observation  cited  in  the  first  statement  of  the 
previous  paragraph  that  the  performance  of  the  algorithm  is  more  likely  to  be  adversely  affected 
by  the  estimate  of  A  associated  with  the  101-st  iteration.  Via  this  example,  we  see  then  that  a 
disproportionate  excess  of  impulses  at  a  given  iteration  of  the  algorithm  is  more  likely  to  affect  its 
convergence  if  the  iteration  occurs  in  the  initial  stages  of  the  algorithm. 

It  was  also  noted  in  (i)  of  Section  4.2.2  that  a  disproportionate  excess  of  impulses  will  have  a 

“2 

significant  effect  on  the  convergence  of  the  algorithm  for  values  of  A  close  to  10  .  That  a  dispro¬ 
portionate  excess  of  impulses  is  more  likely  to  affect  the  convergence  of  the  algorithm  for  smaller 
values  of  A  is  best  explained  with  an  example:  Suppose  that  the  true  value  of  the  parameter  A  is 
10  and  suppose  that  the  first  impulsive  sample  occurs  at  the  second  iteration  of  the  algorithm. 
Then,  given  that  the  samples  have  been  classified  correctly,  there  will  be  a  disproportionate  excess 
of  impulses  at  this  second  iteration.  Furthermore,  the  minimum  possible  iteration  value  for  which 
there  will  not  be  a  disproportionate  excess  of  impulses  is  101  (such  would  be  the  case  if  there  are 
no  impulses  for  iteration  values  3  ^  n  ^101  and  all  samples  are  classified  correctly).  Now  if  the 
true  value  of  the  parameter  A  is  10  1  and  the  first  impulsive  sample  occurs  at  the  second  iteration 
of  the  algorithm,  then  again  there  will  be  a  disproportionate  excess  of  impulses  at  this  second  itera¬ 
tion  but  the  minimum  possible  iteration  value  thereafter  for  which  there  will  not  be  a  dispropor¬ 
tionate  excess  of  impulses  is  now  11.  Thus,  an  iteration  for  which  there  is  a  disproportionate  excess 
of  impulses  has  resulted  in  a  greater  number  of  ensuing  iterations  for  which  there  is  a  dispropor¬ 
tionate  excess  of  impulses  for  the  smaller  value  of  A  .  Given  the  possible  detrimental  effect  on  the 
performance  of  the  algorithm  associated  with  an  iteration  for  which  there  is  a  disproportionate 
excess  of  impulses  (discussed  above),  it  follows  that  an  increased  number  of  iterations  for  which 
there  is  such  an  excess  increases  the  likelihood  that  the  algorithm  will  behave  poorly.  We  see  then 
that  a  disproportionate  excess  of  impulses  at  a  given  iteration  of  the  algorithm  is  more  likely  to 
affect  its  convergence  for  smaller  values  of  A  . 
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Finally,  let  us  justify  the  last  observation  cited  in  (i)  of  Section  4.2.2,  namely,  that  for  a  fixed 
value  of  the  true  parameter  A  ,  the  likelihood  that  a  disproportionate  excess  of  impulses  will  afFect 
the  convergence  of  the  algorithm  increases  with  increasing  K :  Consider  Tables  (A.2)-(A.6)  and 
(A.7)-(A.ll).  For  each  true  parameter  vector,  let  the  "truncated  lower-diagonal  region"  and  "trun¬ 
cated  upper-diagonal  region’  denote  the  set  of  parameter  vectors  in  the  lower-diagonal  region  and 
upper-diagonal  region,  respectively,  for  which  the  first  coordinate  exceeds  the  true  A  .  Note  that  for 
fixed  value  of  the  true  parameter  A  and  increasing  value  of  the  true  parameter  K ,  there  is  a 
corresponding  increase  in  the  size  of  the  truncated  lower-diagonal  region  associated  with  the  true 
parameter  vector  and  a  decrease  in  the  size  of  the  truncated  upper-diagonal  region.  Now,  suppose 
that  we  have  an  iteration  for  which  there  is  a  disproportionate  excess  of  impulses.  Then  the  esti¬ 
mate  of  A  at  that  iteration  will  exceed  the  true  A  and,  consequently,  the  estimate  vector  at  that 
iteration  will  either  lie  in  the  truncated  lower-diagonal  region  or  in  the  truncated  upper-diagonal 
region  corresponding  to  the  true  parameter  vector.  Since  the  size  of  the  truncated  lower-diagonal 
region  increases  with  increasing  K ,  the  probability  that  the  estimate  vector  will  lie  in  the  truncated 
lower-diagonal  region  increases  with  increasing  K .  Moreover,  the  estimate  vector  is  more  likely  to 
lie  in  the  truncated  lower-diagonal  region  associated  with  the  true  parameter  vector  than  in  the 
truncated  upper-diagonal  region  since  the  size  of  the  first  region  relative  to  the  second  increases 
with  increasing  K .  Given  the  detrimental  effect  on  the  performance  of  the  algorithm  (discussed 
earlier)  that  results  from  the  estimate  of  A  being  greater  than  the  true  A  and  the  estimate  of  T 
being  less  than  the  true  I\  it  follows  from  the  increased  likelihood  of  the  estimate  vector  lying  in 
the  truncated  lower-diagonal  region  (and  decreased  likelihood  of  it  lying  in  the  truncated  upper- 
diagonal  region)  that  there  will  be  an  increased  likelihood  of  the  algorithm  performing  poorly. 

—2 

For  fixed  A  .  the  most  pronounced  effect  on  the  algorithm’s  convergence  is  for  K  =  10  since 
the  truncated  lower-diagonal  region  associated  with  the  true  parameter  vector  is  largest  for  this 
value  of  K .  Moreover,  the  truncated  upper-diagonal  region  is  an  empty  set.  This  suggests  a  means 
for  improving  the  performance  of  the  algorithm  somewhat  for  larger  values  of  K ,  namely  by 
extending  the  upper-diagonal  region  (and.  hence,  the  truncated  upper-diagonal  region)  associated 
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with  the  parameter  vectors  corresponding  to  these  larger  values  of  K .  This  can  be  done  by  consid¬ 
ering  a  modified  parameter  set  consisting  of  the  following  parallelepiped: 
{(A  .  K  ,<j  )  :  10  ^  A  ^  1 ,  lO-6^  r  1  .  o-2  >  0}.  However,  given  that  A  is  the  parameter  set 
of  practical  interest  for  the  estimation  problem  at  hand,  another  means  of  improving  the  perform¬ 
ance  of  the  algorithm  suggests  itself  by  recalling  that  (a)  a  disproportionate  excess  of  impulses 
forces  the  estimate  vector  away  from  the  true  parameter  vector  through  successively  increasing 
values  in  the  estimate  of  A  .  and  that  (b)  the  performance  of  the  algorithm  is  more  likely  to  be 
affected  if  the  disproportionate  excess  of  impulses  occurs  in  the  initial  stages  of  the  algorithm. 
From  these  earlier  observations,  we  see  then  that  an  improvement  in  performance  is  possible  if.  in 
the  initial  stages  of  the  algorithm,  the  estimate  of  A  is  fixed  to  its  initial  value  A  0  and  only  the 
estimates  of  K  and  c t  are  updated.  In  this  manner,  the  divergence  of  the  estimate  of  A  from  its 
true  value  can  be  prevented,  and  this,  in  turn,  should  increase  the  frequency  with  which  the  algo¬ 
rithm  converges  to  the  true  parameter  vector.  To  accommodate  this  modification  of  the  BDD  algo¬ 
rithm.  the  following  restriction  on  A  0  must  be  imposed:  Namely, 

(3)  A  o  must  either  provide  an  accurate  estimate  of  A  ,  or,  A  0  must  provide  an  estimate  of  A  for 
which  A  0<  A  . 

Suppose  this  was  not  the  case,  i.e.,  suppose  that  A  0  were  significantly  larger  than  the  true  A  .  Now. 
restrictions  (1)  and  (2)  on  T0  and  <70,  respectively,  guarantee  that  the  initiation  vector 
(A  q,  K q.  crQ)  for  the  algorithm  will  lie  in  the  upper-diagonal  region  corresponding  to  the  true 
parameter  vector.  Thus,  the  probability  of  error  incurred  in  the  decision-making  process  will  be 
relatively  small  initially.  Consequently,  with  relatively  high  probability,  the  impulsive  and  back¬ 
ground  samples  will  be  correctly  classified  and,  eventually,  this  results  in  an  estimate  of  K  which 
accurately  approximates  the  true  value.  (The  estimate  of  cr2  will  always  accurately  approximate 
the  true  value  for  the  reasons  discussed  above.)  Eventually,  then,  we  have  a  situation  where  the 
estimate  of  A  is  fixed  at  its  initial  value  A  0  and  the  estimates  of  K  and  cr2  accurately  describe  the 
true  values,  i.e.,  the  estimate  vector  of  the  true  parameter  vector  lies  in  the  lower-diagonal  region 
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corresponding  to  the  true  parameter  vector.  Thus,  the  probability  of  error  associated  with  this  esti¬ 
mate  vector  will  be  relatively  high  and  this  may  adversely  affect  the  estimate  of  K  and/or  the  esti¬ 
mate  of  A  (if  the  algorithm  is  at  the  stage  where  the  estimate  of  A  is  once  again  being  updated). 
Given  these  ensuing  adverse  effects  on  the  estimates  of  the  parameters  resulting  from  A0  being 
larger  than  the  true  A  .  we  necessarily  impose  the  restriction  given  in  (3). 

Recall  our  earlier  observation  that  for  parameter  vectors  {A  .  K ,  cr2)  for  which  A  =  1 
(cr2=  1).  Pt  (.(A  .  K  .  <x2)  '  ( A  .  K .  cr2))  becomes  relatively  large  for  parameter  vectors  ( A.K .  cr2) 
(cr2=  1)  for  which  JT  is  approi  imately  on  the  order  of  10  1  or  larger.  In  particular,  for  the  true 
parameter  vector  (A  ,  K ,  cr2)  =  (1  . 10  2. 1),  the  vectors  ( A.K.  a ;2)  for  wMch  A  ^  10  l. 
K  =  10-2.  and  c:  =  1  fall  in  this  category.  Note  that  for  these  vectors  the  value  of  A  is  less  than 
A  by  an  order  of  magnitude  or  more.  Thus,  as  this  example  indicates,  if  the  initiation  vector  for 
the  algorithm  has  a  value  for  A0  which  is  less  than  the  true  A  by  an  order  of  magnitude  or  more, 
then  it  is  possible  that  the  probability  of  error  incurred  in  the  decision-making  process  will  be  rela¬ 
tively  large  in  the  initial  stages  of  the  algorithm.  Consequently,  we  replace  condition  (3)  with  the 
more  stringent  condition  that 

(3')  A  o  must  either  provide  an  accurate  estimate  of  A  .  or,  A  0  must  provide  an  estimate  of  A  for 

which  A 0<  A  and  not  less  by  an  order  of  magnitude  or  more. 

Note  that  the  region,  defined  jointly  by  conditions  (1)  and  (3'),  from  which  (A0 ,  ro)  can  be 
chosen  consists  of  a  trapezoid.  Now,  the  BDD  algorithm  directly  provides  initial  estimates  for  the 
parameters  A  and  K ,  not  A  and  T.  Thus,  it  is  necessary  to  translate  restrictions  (l)  and  (3')  on  T0 
and  A0,  respectively,  into  restrictions  on  K0  and  A0.  Given  that  r0  is  the  ratio  of  K0  to  A0. 
restrictions  (1)  and  (3')  do  not  yield  a  restriction  for  K0  in  terms  of  a  simple  inequality,  as  they 
did  for  T0  and  A  0.  Consequently,  instead  of  attempting  to  obtain  estimates  (A  0,  K 0)  such  that  the 
associated  estimates  (A0,  r0)  lie  in  the  aforementioned  trapezoid,  we  will  instead  attempt  to  obtain 
estimates  (A  0.  K 0)  for  which  A  0  satisfies  (3')  and  for  which  K 0  satisfies  the  following  restriction: 
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(l')  K 0  must  either  provide  an  accurate  estimate  of  K .  or.  K 0  must  provide  an  estimate  of  K  for 
which  K0  >  K  , 

i.e.,  the  set  of  vectors  from  which  the  estimates  (A  0.  K 0)  can  be  chosen  is  simply  the  largest  rectan¬ 
gle  contained  in  the  aforementioned  trapezoid.  Note  that  restriction  (1')  on  K 0  is  satisfied  if  we  set 
k 0  to  the  maximum  allowable  value  for  the  parameter  K .  Note,  in  addition,  from  the  tables  given 
for  Pt .  that  for  a  fixed  value  of  A  <  A  .  Pt  ((A  .  K  .  ct2)  ;  (A  .  K  .  cr2))  decreases  with  decreasing 
values  of  K  >  K .  Thus,  to  use  this  property  of  Pe  more  advantageously,  we  will  modify  restrici- 
tion  (1')  as  follows  : 

(1”)  k  o  must  either  provide  an  accurate  estimate  of  K .  or.  k0  must  provide  an  estimate  of  K  for 
which  K  0  >  K  and  not  greater  by  two  orders  of  magnitude  or  more. 

In  summary  then,  we  see  from  Cl"),  (2).  and  (3')  and  the  above  discussion  that  the  following 
restrictions  (on  (A  0.  K0.  cr0  ))  and  modifications  must  be  imposed: 

(Rl)  A0  must  either  provide  an  accurate  estimate  of  A  ,  or.  A  0  must  provide  an  estimate  of  A 
for  which  A 0<  A  and  not  less  by  an  order  of  magnitude  or  more. 

(R2)  K0  must  either  provide  an  accurate  estimate  of  K ,  or.  k0  must  provide  an  estimate  of  K 
for  which  k0>K  and  not  greater  by  two  orders  of  magnitude  or  more. 

(R3)  cr0  must  provide  an  accurate  estimate  of  cr  . 

(Ml)  The  estimator  of  cr  given  by  Eq.  (4.8c)  must  be  replaced  by  an  estimator  of  cr  consisting 

-  2 

of  an  update  equation  for  cr0 . 

The  estimate  of  A  must  be  fixed  to  its  initial  value  A  0  in  the  initial  stages  of  the  algorithm, 

2 

with  only  the  estimates  of  K  and  <r  being  updated. 


(M2) 
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Table  A.l.  FOR  (A.K.g?)  6  Cl 


K 

A 

C4 

1 

O 

yH 

10  3 

T 

o 

tH 

10  5 

10-6 

10~2 

2.1575 

1.0235 

0.3698 

0.1269 

4.2914  X  1 0~2 

10_1 

0.7955 

0.3017 

0.1071 

3.7134  X  10~2 

1.2686  X  10~2 

1 

0.2081 

8.1435  X  10-2 

2.9890  X  10~2 

1.0600  X  10~2 

3.6796  X  10-3 

Table  A.2.  Pe  ((  A  .  K .  a2)  ;  (  A  .  K  .  cr2))  FOR  (A.K.  cr2)  €  Cl  AND 

(A  .  K ,  cr2)  =  (  10-2.  lCf2.  1) 


K 

A 

10  2 

10"3 

10-4 

10'5 

10-6 

10-2 

9.6367X10-4 

0.1220 

0.7531 

0.9587 

0.9864 

10_1 

0.2794 

0.8253 

0.9676 

0.9873 

0.9897 

1 

0.9079 

0.9770 

0.9883 

0.9898 

0.9900 

Table  A.3.  Pt({A.K,  a;2)  .{A.K.  cr2))  FOR  (A.£.or2)€ft  AND 

CA.K.<t2)  =  (  10~2. 10-3. 1) 


K 

A 

10-2 

10"3 

T 

o 

10  5 

10-6 

10-2 

4.9496X10"4 

1.2340X10"4 

0.2199 

0.8294 

0.9702 

10_l 

1.0081X10'3 

0.3637 

0.8726 

0.9751 

0.9883 

1 

0.6148 

0.9204 

0.9804 

0.9888 

0.9899 

Table  A.4.  Pt({A,K,  a2)  ;  (  A  ,  isf ,  <r2))  FOR  (.A.K.g?)  6D  AND 

(A  .  A- .  cr2)  =  (  10-2. 10-4. 1) 


K 

A 

10  2 

10'3 

T 

o 

*o 

1 

O 

rH 

T 

o 

t-H 

10"2 

4.5580X10-4 

1.0445X10-4 

1.4692X10-5 

0.1947 

0.8220 

10_1 

6.3223X10-,S 

1.0964X10-4 

0.3106 

0.8613 

0.9741 

1 

1.2462X10-2 

0.5067 

0.9046 

0.9789 

0.9887 

Ill 


Table  A  .5.  Pt  ((  A  .  K .  a2)  :  (  A  ,  X .  or2))  FOR  (.A.K.  cr2)  €  Cl  AND 

(A  .  K .  a2)  =  C  10“2, 10“5. 1) 


K 

A 

10“2 

i 

o 

v-1 

10““ 

10  5 

10”6 

10“2 

4.5188X10““ 

1.0353X10““ 

— 

1.3580X10'5 

1.6986X10“* 

0.1567 

10"1 

6.2667X10-5 

9.0409X10-6 

1.1281X10-5 

0.2490 

0.8427 

1 

4.3031X10-6 

1.2968X10-3 

0.4048 

0.8847 

0.9767 

Table  A.6.  Pt<X  A  .  K .  a2)  ;(  A  .  K .  cr2))  FOR  (AJ,£2)eil  AND 

(A  .  iT .  cr2)  =  (  10“2. 10“*.  1) 


K 

A 

10  2 

10“3 

10““ 

10  5 

T 

o 

H 

10“2 

4.5148X10““ 

1.0344X10““ 

1.3568X10  5 

1.5980X10“* 

1.9273X10'7 

10_1 

6.2612X10-5 

9.0329X10“* 

1.1392X10“* 

1.1521X10“* 

0.1980 

1 

4.2992X10“* 

6.5826X10-7 

1.3045X10““ 

0.3218 

0.8647 

Table  A.7.  Pt({.A,K.  <r2)  :{A.K,  cr2))  FOR  ( A  .  K  .  ct2)  €  fi  AND 

(A  .  K  ,  cr2)  =  (  10-1. 10-2.  1) 


K 

A 

10  2 

10  3 

10"* 

10"5 

10-6 

10"2 

3.7025X10-2 

1.0023X10-2 

0.2024 

0.7581 

0.8867 

10"1 

7.0416X10"3 

0.3333 

0.7976 

0.8912 

0.9032 

1 

0.5623 

0.8412 

0.8960 

0.9037 

0.9047 

Table  A.8.  Pe((A.K.  cr2)  \  {A.K.  cr2))  FOR  (A.Jf.o;2)6fl  AND 

(A  .K.<r2)  =  (  10~\  10~3,  1) 


K 

A 

10-2 

10  3 

10-4 

— 

10-5 

— 

io-* 

10“2 

3.4883X10-2 

9.3140X10-3 

1.2729X10-3 

0.1781 

0.7513 

10'1 

5. 7418X10-3 

9.4046X10-4 

0.2840 

0.7872 

0.8903 

1 

1.1790X10-2 

0.4632 

0.8268 

0.8946 

0.9036 

Table  A.9.  Pt((.A.K,  <r2)  \  (A.K.  a2))  FOR  (A.£,a;2)€D  AND 

(A  .  <r2)  =  (  10_1,  lO"4. 1) 


K 

A 

10  2 

10"3 

10"4 

10  5 

10"* 

10~2 

3.4662X10-2 

9.2430X10-3 

1.2618X10'3 

1.4951X10-4 

0.1432 

10-1 

5.6972X10-3 

8.4180X10-4 

1.1581X10-4 

0.2276 

0.7702 

1 

4.0151X10-4 

1.2462X10-3 

0.3700 

0.8086 

0.8927 
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AND 


10  2 

10  3 

10~“ 

10  5 

3.4640X10-2 

9.2359X10"3 

1.2608X10-3 

1.4929X10-4 

5.6927X10'3 

8.4112X10-4 

1.0646X10-4 

1.3724X10-5 

4.0118xl0~4 

6.1526X10-5 

1.2744X10-4 

0.2941 

0.7902 


AND 


K 

A 

10-2 

10"3 

T 

o 

T— < 

10  5 

10”* 

10~2 

3.4637X10'2 

9.23.52X10’3 

1.2607X10"3 

1.4928x10”“ 

1.7088X10”5 

10'1 

5.6922X10"3 

8.4106X10“^ 

1.0645x10““ 

1.2795X10”5 

1.5862x10”* 

1 

4.0115X10-4 

6.1521X10-5 

8.2904x10”* 

1.2966X1 O'5 

0.2336 

Table  A.12.  Pt((A.K.  cr2)  :(A.K.  cr2))  FOR  (  A  .  £  .  a;2)  €  D  AND 

(A  ,  if .  cr2)  =  (  1.10-2.  1) 


£ 

A 

10  2 

10  3 

10~* 

10  5 

iO-6 

10-2 

0.5918 

0.3365 

6.2612X10-2 

8.0124X10-2 

0.3063 

10"1 

0.2370 

4.2532X10'2 

0.1210 

0.3207 

0.3620 

1 

2.5268X10-2 

0.1915 

0.3366 

0.3638 

0.3674 

Table  A.13.  Pt((  A  .K.a?) ;(  A  .iT.cr2))  FOR  (A.K.  cr2)  6  12  AND 

(A  .K .  <r2)  =  (  1. 10-3. 1) 


A 

10  2 

10~3 

10~* 

10  5 

1 0-6 

10~2 

0.5915 

0.3361 

6.2539X10-2 

7.7522X10-3 

5.9115X10-2 

10"1 

0.2367 

4.2445X10-2 

5.5414X10'3 

9.3193X10'2 

0.3132 

1 

2.0615X10”2 

3.6880X10-3 

0.1509 

0.3288 

0.3629 

Table  A.14.  F€  ((  A  ,  .£ .  ct2)  ;  (  A  .  .  <r2))  FOR  (A,^,o;2)6Q  AND 

(A  ,  2T ,  cr2)  =  (  1, 10^.  1) 


K 

A 

10"2 

10  3 

»-*■ 

o 

1 

10  5 

io-* 

10'2 

0.5914 

0.3361 

6.2531X10-2 

7.7513X10-3 

8. 9218x10^ 

10_1 

0.2367 

4.2440X10-2 

5.5370X10-3 

6.6852X10-4 

7.3654X10-2 

1 

2.0613X10"2 

3.2060X10-3 

4.8144X10-4 

0.1196 

0.3213 
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Table  A.15.  Pe  ((  A  .  K .  g?) ;  (  A  .  K .  <r2))  FOR  {A.K.  cr2)  €  11  AND 

(A  ,i:.o-2)  =  (  1, 10_s.  1) 


K 

A 

10"2 

10  3 

10-4 

10  5 

10-6 

1 

O 

t— 4 

0.5914 

0.3361 

6.253 1X10-2 

7.7512X10-3 

8.9216X10-4 

H 

1 

O 

r-4 

0.2367 

4.2440X10-2 

5.5369X10-3 

6.6813X10*4 

7.8054X10-5 

1 

2.0613X10-2 

3.2060X10-3 

4.3299X10-4 

5.9325X10-5 

9.4999X10-2 

Table  A.16.  Pe  ((  A  .  K .  a?)  ;  (  A  .  K .  cr2))  FOR  (  A  .^.a2)  €  ft  AND 

(A  .  isT .  cr2)  =  C  1 , 10-6. 1) 


K 

A 

M 

1 

o 

1—4 

ro 

1 

o 

H 

T 

o 

10  5 

10”* 

10"2 

0.5914 

0.3361 

6.2531X10-2 

7.7512X10"3 

8.9216X10' 4 

10"1 

0.2367 

4.2440X1 0-2 

5.5369X10-3 

6.6813X10”4 

7.8017x10  s 

1 

2.0613X10-2 

3.2060X10-3 

4.3299X10"4 

5.4477X10-5 

7.0491X10-* 
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APPENDIX  B.  EXPRESSIONS  FOR  THE  COEFFICIENTS  OF  EQ.  (5.28) 


Let  a  and  ai}  (£  =  1 . n  .j  =  1.2....)  be  defined  as  in  Section  5.3.  Futhermore.  let 


and 


01 A  Z  (l+2^an  . 


MZa+*.a)Z-rT  ■ 

i=l  )=2-t  * 

ft 

03  A  L  1  . 


»=1 


ft  OO 

04  A  I  Lv 


»=1  J  =2 


au 

O-i )2  ‘ 


Then. 


ft  OO 

03  A  £ 


i  =1  J  =2 


<0 

0-D 


c  t  =  /342  n  2  —  02  04  n  (03  +  n  )  +  04  n  (05  +  n  )2  . 

c2  =  (_02  04n2~0203  04n  ~0204an) 

+  (  2  /3 3  04  n  —  03  04  n  +  022  n  +  2  04  a n  )(05  +  n  ) 

(  02  n  )(05  "h  )2  . 

c3  =  (04n3  +  2  0304n2  —  0X  04n2  i-  2  04  a  n2) 

+  (/3j  04  n  +  0  2  03  n  -  0X  03  04  n  -  0X  04  a  n  +  2  03  04  a  n  +  04  a2  n  ) 

+  (  —  02 n2  —  3  02  03 n.  +20j02n  —  02an)(05  +  n) 

+  (n2  +  03  n  —  0X  n  )(05  +  n  )2 

c4  =  (  2  02  03  n  —  2  02  03  ft  +  2  0!  02  03  n  _  2  02  03  ®  H.  ) 

+  (  2  03  n  2  —  03  n  2  —  3  0X  03  n  +2  0  2  n  +  0  2  n  —  0t  a  n  +2  03  a  n  )(05  +  n  ) 


and 


c5  =  (03n3-20103n2  +  2  03  n2  +  203an2) 

+  (  02  03 n  —2  0X  032n  +  033 n  —  2  0X  03  an  +  2  02  an  +  03 a 2  n  )  . 
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